Dan Housman: Hello, this is Dan Housman. I’m the Chief Technology Officer for Graticule and I am the lead for the N3C’s pharma commercial collaboration team. Our goal is to help commercial organizations engage with the NIH National COVID Cohort Collaborative, in the hope of driving more use of this very valuable resource. I’m here today with Joy Alamgir from ARIScience, whose group is one of the early adopters of the platform. We’re going to talk about how they got involved with the N3C data and how they’re working with it. So, Joy, I know your company’s doing drug repositioning work, but can you explain what drug repositioning is?
Joy Alamgir: Sure. Thanks, Dan. So, drug repositioning is where you take an existing drug that’s on the market indicated for some other disease, and try to use it to prevent or treat another disease. In this case coronavirus is the disease that we’re trying to interrupt. We’re trying to see through our research if any of the 1,513 FDA approved drugs has the potential to directly interrupt specific proteins of the coronavirus.
Dan Housman: What’s ARIScience doing differently from other groups trying to do drug repositioning?
Joy Alamgir: We do supercomputing-based quasi-quantum simulations to come up with candidate drugs that have a high potential of interrupting specific proteins and their sub-structures. That’s what makes us different: we try to do the mathematical part beforehand.
Dan Housman: How does this relate to the N3C? How did you find the N3C?
Joy Alamgir: It’s very important because once you have these simulation candidates that come from our simulations, you have to test whether they are actually valid, because they’re just simulation results. The test can be done either through in vitro or in vivo tests, which are expensive, or through statistical power that the N3C system brings in. You can see through a retrospective case-cohort, where you take a bunch of patients that have been exposed to the drug and a bunch of patients that have not been exposed to the candidate drug. You can compare what their experiences have been in coronavirus. That is the real power, especially when the N3C data has about a million records with 10% COVID positivity, giving a tremendous amount of statistical power to that analysis.
Dan Housman: How did you find N3C? What was your process?
Joy Alamgir: So, we were lucky. About three months ago, when we got into drug repositioning for coronavirus, we were looking for patient datasets. We looked at commercial vendors and non-commercial vendors for that data and stumbled upon N3C through Google searching. We communicated with N3C and then got on board.
Dan Housman: Did you find N3C was better or worse than the commercial options or any of the options that you looked at?
Joy Alamgir: Well, now that we have access to the data, it’s a very rich set of data compared to some of the other options we looked at. The number of patient records and also the breadth of the records in N3C paint a rich picture. Also, getting data from a commercial vendor costs money. We both avoid those costs and repurpose the money for important coronavirus research.
Dan Housman: You’ve been able to move faster than some of the large pharma groups that are interested in participating as well. Can you tell me a bit about the process to get involved?
Joy Alamgir: Sure. There are benefits to being a small biotech company, as you can be very nimble, but the N3C folks made the process quite straightforward. It’s a two-step process. The first step starts with a data use agreement so that the N3C and NIH know what you’re going to use that data for. Once you cross that hurdle, you have to put in data utilization requests, which is a way for you to specifically say that this is what you’re going to use the data for and this is what your hypothesis is. Then they approve the data utilization requests, provided that you’re going to use the data for coronavirus research. Then, after a little bit of training and tutorials, you’re in the midst of the data.
Dan Housman: How long did that take for your team?
Joy Alamgir: For our team, from the time we executed the DUA or started reviewing the DUA to the time we actually got approved for their data with the utilization request, it was about a month and a half.
Dan Housman: Do you think it’s still going to take the same amount of time or was that because they were still in development?
Joy Alamgir: I think it’s going to be faster because we were one of the first ones to get on board, so we had to work through the deficiencies in the system at that time. I think over time they have made the process more efficient. Overall, I think the N3C folks have been very responsive. They’ve been responsive to our questions and our concerns.
Dan Housman: Now that you’re on the other side and doing your research, what’s been your experience with the data and with the community as you try to do your project?
Joy Alamgir: It’s a very collaborative community. They have a very active Slack channel with a wide variety of experts who provide insight into the data. As with any system that has a million patient records, there are some data issues, but nothing tremendous. We were lucky that our epidemiologists and statisticians got their hands dirty in the data, understood how it’s structured, and translated it in a way that let us actually run our statistical tests. I think that was very useful.
Dan Housman: What do you think other groups should take away? The groups that haven’t jumped on board yet?
Joy Alamgir: Well, especially for small biotech firms like us and perhaps also for larger pharmaceuticals, if you’re looking for a patient data set on coronavirus patients for statistical analysis, this is a rich dataset. It’s a million records strong with a 10% coronavirus positivity rate. In my 15 plus years of experience of dealing with health data, that’s one of the largest data sets out there, structured in a nice way.
Dan Housman: So, maybe just to close out, Joy, can you tell me a bit about how you got involved with this kind of work and your interest in the space?
Joy Alamgir: Sure. I’ve been dealing with health data for over 15 years and have worked with some of the largest health organizations here in the US. Having computation-driven and intelligence-driven drug design is something that I’ve been thinking about for about 10-15 years. About a couple of years ago, I decided that it’s time to actually put the proof to the pudding. I got a few people with like-minded aims involved and put a team together to see if this is something that we can do. Now we’re actually simulating 1,513 drugs and using three different supercomputing centers. Every day we have some results trickling in, and these are the results that we want to use the N3C data to validate statistically. So, it’s a really exciting time for us. I think my background and that of the team members drew us towards using N3C to see if we can make a dent in coronavirus.
Dan Housman: Joy, thanks so much for all the work your team is doing.
Joy Alamgir: Thank you Dan.