Mathematics and Statistics
“Quantifying differences between image feature extraction tools”
Over the past several decades, microscopy has been one of the main tools in cell biology. Digital assessment of microscopy images continues to play a major role in the field: biologists analyze cell state and expression through software-based measurements of cellular regions of interest. As biological experiments produce ever larger volumes of microscopy image data, qualitative visual inspection has to be replaced with quantitative, automated image-based measurements in order to derive knowledge about cells with high confidence. Automation of software-based image measurements (image features) has led to a plethora of implementations. Software libraries contain thousands of implementations, but many of them implement the same image measurement and sometimes return different numerical values. This poses a challenge for deriving knowledge about cells and for reproducing scientific results. The overall goal of this project is to quantify the differences between image features computed by several feature extraction tools. Extracted features can be classified into three main categories: (1) shape features, such as circularity, perimeter, and area; (2) intensity features, such as mean, standard deviation, skewness, and entropy; and (3) texture features, such as contrast, energy, and homogeneity. Our approach is to identify implementations of the same image features across multiple software packages, extract numerical values of the image features, and compute the statistical variance of the image features. We analyze the following software packages: ImageJ, the Protein Subcellular Location Image Database (PSLID) from the Murphy Lab at Carnegie Mellon University, and our NIST feature extraction tools in MATLAB, Java, and Python. A Python script reads the features calculated by these software libraries, executes the feature extraction algorithms, and computes variances for all common image features.
All commonly found image features will be rank-ordered based on the variances to assist biologists in achieving repeatability of image measurements.
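The variance and rank-ordering steps described above can be sketched roughly as follows. This is only an illustration, not the project's actual script: the tool names, the feature values, and the helper `rank_by_variance` are hypothetical, and the sample variance across tools is assumed as the measure of disagreement.

```python
from statistics import variance

# Hypothetical values of each feature for a single cell, as reported by
# three tools. Names and numbers are illustrative only, not measured results.
measurements = {
    "area": {"tool_a": 100.0, "tool_b": 100.0, "tool_c": 100.0},
    "perimeter": {"tool_a": 40.0, "tool_b": 42.0, "tool_c": 44.0},
    "circularity": {"tool_a": 0.80, "tool_b": 0.70, "tool_c": 0.90},
}


def rank_by_variance(measurements):
    """Return (feature, variance) pairs, most variable feature first."""
    scored = [
        (name, variance(list(values.values())))  # sample variance across tools
        for name, values in measurements.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_variance(measurements)
for name, var in ranking:
    print(f"{name}: variance across tools = {var:.4f}")
```

Because raw variance depends on each feature's units and scale, a ranking like this mainly flags which features deserve a closer look; it does not by itself say which tool is correct.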
How did you find out that you could do research in your field in the summer?
I have known since high school that I could do research: I interned at the Johns Hopkins University Applied Physics Laboratory the summer before my senior year and continued there until the end of that winter break.
How did you know that research at NIST was what you wanted to do?
I knew older students from the Meyerhoff Scholars Program – many of whom were also math majors – who had previously interned there and enjoyed their experience.
Did you apply to other places?
As a freshman Meyerhoff Scholar, I was required to apply for 17 summer research positions, but I applied for a few more, so about 20 or 21 in total. I received two offers in addition to NIST, along with a few rejections, but after I was accepted to NIST, I withdrew as many of my remaining applications as I could.
Was the application difficult to do? Did you have help with this?
The application was fairly straightforward. Before my freshman year, I had already written a resume and had it reviewed by Ms. Janet McGlynn and the Career Center. The personal statement was more challenging, but I received invaluable help from some older Meyerhoff scholars. Without their help, I might not have gotten this internship. I also went to the Writing Center a few times.
What was your summer research project?
My research was in bioimage informatics, a subfield of bioinformatics. I studied, and in some cases wrote, software that calculates the image features of cells (essentially, their quantitative properties, such as area, intensity, and contrast). I recorded these features in CSV files, one per feature extraction tool, and extracted the features the tools had in common. I then performed statistical analysis on the differences between the tools and tried to quantify and explain those differences.
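A rough sketch of the "one CSV per tool, then find the common features" step might look like the following. The file contents, the `cell_id` column, the tool names, and the helper `read_features` are all hypothetical; matching feature names by lowercasing them is an assumption, since real tools often name the same feature differently and would need a hand-built name mapping.

```python
import csv
import io

# Hypothetical CSV exports, one per tool. Real files would be opened from
# disk; inline strings keep this sketch self-contained.
TOOL_CSVS = {
    "tool_a": "cell_id,Area,Perimeter,Mean\n1,100,40,12.5\n",
    "tool_b": "cell_id,area,perimeter,entropy\n1,101,42,3.2\n",
}


def read_features(text):
    """Parse one tool's CSV into {cell_id: {feature_name: value}}."""
    rows = {}
    for row in csv.DictReader(io.StringIO(text)):
        cell_id = row.pop("cell_id")
        # Lowercase names so "Area" and "area" are treated as the same feature.
        rows[cell_id] = {
            name.strip().lower(): float(value) for name, value in row.items()
        }
    return rows


tables = {tool: read_features(text) for tool, text in TOOL_CSVS.items()}

# Features that every tool reports for cell "1".
common = set.intersection(*(set(table["1"]) for table in tables.values()))

for feature in sorted(common):
    values = {tool: table["1"][feature] for tool, table in tables.items()}
    print(feature, values)
```

Only the features in `common` can be compared across tools; features that a single tool reports, such as the mean or entropy columns here, drop out of the comparison.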
Who is your mentor for your research project? How did you arrange to work with this person?
My mentor was Dr. Joe Chalfoun, one of the scientists and engineers working on this project in the Software and Systems division of the Information Technology Laboratory at NIST.
How much time do you put into this work?
I worked from 8:45 to 5:00 on weekdays for 11 weeks this summer.
Were you paid? Where did you live?
I received a stipend of $5,500 for my work at NIST. I lived in Quality Suites, a nearby hotel (paid for by the program), which had a shuttle bus that took us to and from NIST every day. Most of the interns at the hotel had a roommate, but I got a single because I was the odd one out among the male interns who needed housing. As an introvert who enjoys his private space after a long day, I enjoyed that very much. I still got to interact with the other interns on many occasions. One of my suitemates, who was also at NIST this summer, formed a nature group among the interns, so we would occasionally go hiking on the weekends. Another group went to DC every now and then, so I would sometimes go with them and some friends.
What academic background did you have before you started?
I had just completed my freshman year, where I took a semester of mathematical modeling and statistics, and a year of real analysis (though out of the three, I only used techniques from my statistics class for this internship). I also knew how to program in Java and Python, and perform advanced mathematical calculations in MATLAB and Mathematica.
How did you learn what you needed to know for this project?
I had to learn a lot on my own, specifically about image features and how to use the various software packages that calculate them. I received some assistance from my mentor and one of the scientists with whom I was collaborating on this project, but for the most part, I read papers on the software and went through a few tutorials.
What was the hardest part about your research?
The hardest part was definitely the beginning. Shortly after I came in, my mentor gave me some incomplete feature extraction software written in MATLAB and wanted me to finish it and make it user-friendly within two weeks. Unfortunately, because of the level of MATLAB proficiency required, it took me much longer than he had expected: although I was comfortable doing mathematics in MATLAB, my experience programming in it was very limited.
What was the most unexpected thing?
How casually the staff at NIST dressed, especially in the lab I worked in. It was primarily a computer science lab, so no one was working with chemicals or anything like that. Most of them wore jeans, t-shirts, sometimes even shorts! One of the scientists there even worked barefoot. It was a big culture shock after my previous internship at APL, where most of the staff dressed business casual (a button-down shirt and slacks, with the most casual option being a polo), and I saw only one scientist in jeans and a t-shirt.
How does this research relate to your course work?
My research involved a lot of coding and, later on, statistics. I am a mathematics and statistics double major with a computer science minor, so it relates closely to my coursework, especially in the latter two fields.
What is your advice to other students about getting involved in research?
Apply for as many research positions as possible, especially if you’re a freshman with limited relevant coursework and research experience. Also, take programming courses, or at least teach yourself programming, depending on your major (i.e., if you’re a STEM major). Almost every intern I talked to at NIST, regardless of their major, was doing some form of programming.
What are your career goals?
After I graduate from UMBC, I plan to go to graduate school in applied mathematics, bioinformatics, or computer science. I’m interested in a career that involves mathematical modeling, statistics, and computer science.
Are you a transfer student or did you start at UMBC as a freshman?
I actually started at UMBC as a junior in high school (I was homeschooled). Based on my SAT scores, I was admitted to UMBC’s Young Scholars program, which allowed me to take classes as a high school student and earn credit. I later enrolled as a freshman at UMBC after graduating from high school.
Do you now live on campus or commute to UMBC?
I live on campus.