Description

With the spread of electronic health records and increasingly low-cost assays for patient molecular data, powerful data repositories are being built, with tremendous potential for biomedical research, clinical care, and personalized medicine. But these databases are large and difficult for any one specialist to analyze. This course introduces methods for finding the hidden associations within the full set of data, and it includes a programming project.

Prerequisites

CS106A
Highly recommended: STATS216

Topics include

  • Methods for data mining at internet scale
  • Handling of large-scale electronic medical records data for machine learning
  • Methods in natural language processing and text-mining applied to medical records
  • Methods for using ontologies to annotate and index unstructured content, as well as semantic web technologies

Course Availability

The course schedule is displayed for planning purposes – courses can be modified, changed, or cancelled. Course availability will be considered finalized on the first day of open enrollment. For quarterly enrollment dates, please refer to our graduate education section.