We are recruiting two PhD students for our funded NSF BIGDATA and IIS grants in the College of Computing and Informatics at Drexel University.

The students will be involved in interdisciplinary projects on data science and healthcare informatics, developing advanced machine learning and distributed computing algorithms for predictive analytics on 2 petabytes of driving data and other healthcare data.

The projects have four goals: (1) pre-processing a large volume of heterogeneous data, including feature selection, data cleansing, de-duplication, filtering, format conversion, data fusion, and normalization; (2) developing scalable predictive analytics algorithms, including case-based reasoning, heterogeneous network mining, and tensor decomposition; (3) developing novel distributed computing infrastructure, including distributed data storage based on Spark, inverted indexing and distributed searching, and asynchronous parallel graph mining; and (4) developing a system prototype.

Successful candidates are expected to have a BS or MS (preferred) degree in computer science, information science, or a related field. Candidates are also expected to have some previous experience in machine learning and distributed computing, and a willingness to participate in interdisciplinary research.

Students who are interested in these positions should provide (1) a CV with a list of publications (if any), (2) contact information for two referees, (3) a brief description of previous research experience and future research interests, and (4) copies of BS and/or MS transcripts.

Please email chris.yang@drexel.edu before December 31, 2017, with "PhD application: Data Science and Healthcare Informatics" as the subject line and all supporting documents attached as PDF files.

Information is also available at DBWorld and KDnuggets.