Principal Scientific Researcher
Developing novel computational methods for processing and analysing DDA and DIA data. Operationalizing high-throughput statistical and QC workflows and visualization dashboards.
Aspiring Full Stack Data Scientist, currently based in San Francisco, California. Developing software tools for mass spectrometry and NGS proteomics. Five years of cumulative experience with a data science stack of R-Shiny, Python, SQL, Spark, H2O.ai, Keras, and JavaScript.
I enjoy translating business needs into working solutions.
MSstatsSampleSize is an open-source R package, and the MSstats Sample Size Estimator is an R-Shiny web application built around it. It allows researchers to design optimal MS-based proteomics experiments in terms of statistical power and the use of resources. In particular, MSstatsSampleSize uses protein-level data from a prior MS-based proteome investigation as a basis to plan future experiments with similar methodologies.
A comparative study of classical time-series forecasting methods, such as ARMA and ARIMA with STL decomposition, and Gaussian mixture models. The interaction between hour of the day and day of the week provided the most reliable predictions using the Gaussian mixture models.
An attempt to use sequence-to-sequence bidirectional RNNs to tackle the problem of abstractive text summarization. We also explore an unsupervised way to extract initial summaries, which we then use to train our neural model.
A simple web scraper built with the BeautifulSoup library in Python to extract data from Indeed given arguments such as Title, Position, and Experience Level. The scraped data is cleaned and stored in a MySQL database. A short summary of the extracted information is reported via email using the mailR library and R scripting.
Developing novel computational methods for processing and analysing DDA and DIA data. Operationalizing high-throughput statistical and QC workflows and visualization dashboards.
Supporting research and development of the MSstats ecosystem as part of Vitek-Lab. Developing simple and intuitive web interfaces to help statisticians and proteomics researchers design experiments and test hypotheses. Implementing and packaging machine learning algorithms.
Worked with the cross-functional Energy Markets team, learning and supporting various processes, including but not limited to the Enrollment, Nomination, and Settlement of customers in the various Demand Response programs. Set up various internal analytics tools for monitoring customer load.
Operationalized analytic and scheduling systems for the 1000 Journeys projects, wearing various hats in the very early days of Affect Mental Health. Worked with the interview teams to determine the best data collection strategies.