Research Assistant (Biophysics)

Florida State University

  • Designed a framework in Python and PySpark (Apache Spark) to read a variety of structured and unstructured protein databases, enabling side-by-side comparisons of how other labs study protein motion and thereby testing the effectiveness of our own methods.
  • Used the framework to build a pipeline in which protein data was queried with Spark SQL, enabling fast queries on large datasets.
  • Ensured code quality with pre-commit, black, isort, flake8, and GitHub Actions.
  • Wrote Python libraries for serializing/deserializing PDB files and transforming protein data into pandas DataFrames, enabling other teams to work with protein data in analytics-friendly formats (such as Apache Parquet).
  • Presented research findings to department faculty in a public lecture.