An Automated Customizable Live Web Crawler for Curation of Comparative Pharmacokinetic Data: An Intelligent Compilation of Research-Based Comprehensive Article Repository
Abstract
1. Introduction and Background
2. Methodology for PK Web Navigating/Crawling System
2.1. The Architecture of Web Crawler for PK (WCPK)
- Identifying drug aliases from the ATC Classification System and grouping identical active ingredients across different routes of administration. Sources and countries differ in the phrases and terminology they use for each data field, and a drug may be represented by several aliases, including generic, brand (trade), and international names, medicinal formulations, and active substances or active ingredients, as well as mismatched phrases and special (or international) characters. The combinations of active ingredients, generic names, brand names, etc. must therefore be consolidated. To do so, drug names are mapped to parent drugs using the DrugBank database (Alberta Innovates–Health Solutions, The Metabolomics Innovation Center) [23] and the KEGG DRUG database [24,25]. The resulting groups of similar active ingredients are presented in Supplementary Materials Table S1;
- Metadata search queries from TDM API service providers such as Scopus, Springer, Crossref, arXiv, BioMed Central, PubMed, Web of Science, IEEE, and Public Library of Science must be defined. The following is a search query from the API service provider Scopus: Search ((ALL((drugs) AND (clearance) AND (volume of distribution) AND (route)))), for metadata retrieval;
- Full-text API calls: Scopus API, Springer API, Crossref REST API for full-text access;
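The query structure described in the list above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `quote_terms` and `build_scopus_query` are hypothetical, and the exact syntax accepted by a given API provider should be checked against its own search guide.

```python
def quote_terms(terms):
    """Quote each term and join with OR, dropping blanks and case-insensitive repeats."""
    kept = []
    for term in terms:
        cleaned = term.strip()
        if cleaned and cleaned.lower() not in (k.lower() for k in kept):
            kept.append(cleaned)
    return " OR ".join(f'"{t}"' for t in kept)


def build_scopus_query(drugs, routes):
    """Assemble an ALL(...) query with the same four clauses as the example query."""
    if not drugs or not routes:
        raise ValueError("drug names and routes are both required")
    return (
        f"ALL(({quote_terms(drugs)}) "
        'AND ("clearance") AND ("volume of distribution") '
        f"AND ({quote_terms(routes)}))"
    )


# Aliases of one parent drug are OR-ed together; the trailing-space repeat is dropped.
query = build_scopus_query(
    ["doxycycline", "Vibramycin", "doxycycline "], ["oral", "intravenous"]
)
print(query)
```

Grouping all aliases of a parent drug into one OR clause keeps the number of API calls low, at the cost of longer query strings.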
2.2. The Module of Article Metadata Extraction
Procedure 1. Module of Article Metadata Extraction
procedure doi_scopusSearch (drug list, PK parameters):
begin
    derive search query input variables from the procedure parameters based on drug class.
    if variables (drug names, clearance, volume of distribution, route) are available:
        execute ScopusSearch query (input variables) to retrieve all metadata information.
        if (outcome is NOT NULL):
            save metadata results to Data Frame
            save metadata results to a CSV file.
end
output: articles metadata file for desired drug class
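Procedure 1 might translate into Python roughly as below. This sketch makes several assumptions: `search_fn` is a hypothetical callable that takes a query string and returns a list of dictionaries (it could, for instance, wrap the pybliometrics `ScopusSearch` class, whose results would first need converting to dictionaries), `fake_search` is a made-up stand-in so the example runs offline, and the standard-library `csv` module is used in place of a Data Frame.

```python
import csv
import os
import tempfile


def doi_scopus_search(drugs, routes, search_fn, out_path):
    """Run one metadata query and save the rows; return the number of rows written."""
    # Mirror the availability check in Procedure 1: skip if any input is missing.
    if not drugs or not routes:
        return 0
    drug_clause = " OR ".join(f'"{d}"' for d in drugs)
    route_clause = " OR ".join(f'"{r}"' for r in routes)
    query = (f'ALL(({drug_clause}) AND ("clearance") '
             f'AND ("volume of distribution") AND ({route_clause}))')
    rows = search_fn(query)
    if not rows:  # covers both None and an empty result list
        return 0
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


def fake_search(query):
    """Hypothetical stand-in for a real API client; returns one made-up record."""
    return [{"eid": "demo-eid", "doi": "10.1000/demo", "subtype": "ar"}]


out_file = os.path.join(tempfile.mkdtemp(), "metadata.csv")
n_saved = doi_scopus_search(["doxycycline"], ["oral"], fake_search, out_file)
print(n_saved, "record(s) written to", out_file)
```

Passing the search client in as a parameter keeps the procedure testable without an API key.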
2.3. The Module of Duplicate Handling
Procedure 2. Module of Duplicate Handling
procedure doi_duplicateHandling (doi, subtype, subtypeDescription):
begin
    input variables: doi, subtype, subtypeDescription from the metadata file
    if input variables are NOT NULL:
        invoke duplicateHandler module:
            if doi is NOT NULL:
                drop duplicates if they exist.
            if (subtype is bk):
                save doi with corresponding information in book-doi CSV file.
            else:
                invoke springerLookup module:
                    if doi matches certain semantic rules:
                        save doi information in Springer-doi CSV file.
                    else:
                        save doi information in Scopus-doi CSV file.
end
output: book-doi, Springer-doi, Scopus-doi CSV files.
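The routing logic of Procedure 2 can be illustrated with a short Python sketch. The source does not spell out the "semantic rules" used by the Springer lookup, so the example assumes a simple DOI-prefix test (`10.1007/`) as a placeholder; the function name `handle_duplicates` and the in-memory buckets (standing in for the three CSV files) are likewise hypothetical.

```python
def handle_duplicates(records, springer_prefixes=("10.1007/",)):
    """Deduplicate by DOI and route each record to a book, Springer, or Scopus bucket."""
    buckets = {"book": [], "springer": [], "scopus": []}
    seen = set()
    for record in records:
        doi = (record.get("doi") or "").strip().lower()
        if not doi or doi in seen:  # skip NULL DOIs and duplicates
            continue
        seen.add(doi)
        if record.get("subtype") == "bk":
            buckets["book"].append(record)
        elif doi.startswith(springer_prefixes):  # assumed stand-in for the semantic rules
            buckets["springer"].append(record)
        else:
            buckets["scopus"].append(record)
    return buckets


records = [
    {"doi": "10.1007/demo-1", "subtype": "ar"},
    {"doi": "10.1007/DEMO-1", "subtype": "ar"},  # duplicate differing only in case
    {"doi": "10.1016/demo-2", "subtype": "ar"},
    {"doi": "10.1002/demo-3", "subtype": "bk"},
    {"doi": None, "subtype": "ar"},
]
buckets = handle_duplicates(records)
print({name: len(items) for name, items in buckets.items()})
```

DOIs are compared case-insensitively here because DOI names are case-insensitive; whether the original module does the same is not stated.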
2.4. The Module of Full-Text Retrieval
Procedure 3. Module of Full-Text Retrieval
procedure doi_fulltextRetrieval (doi, API key, file-folder):
begin
    input variables: doi, API key, file-folder
    for doi in doi-list:
        case 1:
            append doi to doi_visited list.
            invoke Scopus-API handler:
                if webpage returns full text (XML or JSON) then
                    save content.
                elseif webpage returns full text (HTML) then
                    if the Methods and Results sections return true then
                        save content.
                else:
                    append doi to unsuccessful.csv file.
        case 2:
            append doi to doi_visited list.
            invoke Springer-API handler:
                if webpage returns full text (XML) then
                    save content.
                else:
                    append doi to unsuccessful.csv file.
        case 3:
            append doi to doi_visited list.
            invoke Crossref-API handler:
                if webpage returns full text (XML or PDFs) then
                    save content.
                else:
                    append doi to unsuccessful.csv file.
        case 4: unsuccessful doi list
            append doi to doi_visited list.
            if a PDF link is available then
                save the content.
            else:
                return incomplete-DOI list.
end
output: XMLs, HTMLs, PDFs, incomplete-DOI list
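A simplified reading of Procedure 3 is an ordered fallback: try each API handler in turn, then a PDF link, and record the DOI as incomplete if nothing succeeds. The sketch below follows that reading and is an assumption about the control flow rather than the authors' code; the handlers are hypothetical callables that take a DOI and return content or `None`, and the lambdas in the usage part are made-up stand-ins for real API clients.

```python
def retrieve_full_texts(dois, handlers, pdf_fallback):
    """Try each (name, handler) pair per DOI, then a PDF fallback; collect failures."""
    saved, incomplete, visited = {}, [], []
    for doi in dois:
        visited.append(doi)
        content, source = None, None
        for name, handler in handlers:
            try:
                content = handler(doi)
            except Exception:
                content = None  # treat any handler error as "no full text"
            if content:
                source = name
                break
        if not content:
            # Last resort, corresponding to case 4 in Procedure 3.
            content = pdf_fallback(doi)
            source = "pdf" if content else None
        if content:
            saved[doi] = (source, content)
        else:
            incomplete.append(doi)
    return saved, incomplete, visited


# Made-up handlers: each "succeeds" only for DOIs ending in a given digit.
handlers = [
    ("scopus", lambda doi: "<xml/>" if doi.endswith("1") else None),
    ("springer", lambda doi: "<xml/>" if doi.endswith("2") else None),
]
saved, incomplete, visited = retrieve_full_texts(
    ["10.1000/a1", "10.1000/b2", "10.1000/c3"], handlers, lambda doi: None
)
print(sorted(saved), incomplete)
```

Returning the incomplete list separately makes it straightforward to retry those DOIs later or to hand them over for manual download.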
3. Results
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Riviere, J.E. Comparative Pharmacokinetics: Principles, Techniques and Applications; John Wiley & Sons: Hoboken, NJ, USA, 2011. [Google Scholar]
- Grzegorzewski, J.; Brandhorst, J.; Green, K.; Eleftheriadou, D.; Duport, Y.; Barthorscht, F.; Köller, A.; Ke, D.Y.J.; De Angelis, S.; König, M. PK-DB: Pharmacokinetics Database for Individualized and Stratified Computational Modeling. Nucleic Acids Res. 2021, 49, D1358–D1364. [Google Scholar] [CrossRef]
- Jambhekar, S.S.; Breen, P.J. Basic Pharmacokinetics; Pharmaceutical Press: London, UK, 2009; Volume 76. [Google Scholar]
- Garralda, E.; Dienstmann, R.; Tabernero, J. Pharmacokinetic/Pharmacodynamic Modeling for Drug Development in Oncology. Am. Soc. Clin. Oncol. Educ. Book 2017, 37, 210–215. [Google Scholar] [CrossRef] [PubMed]
- Meibohm, B.; Derendorf, H. Basic Concepts of Pharmacokinetic/Pharmacodynamic (PK/PD) Modelling. Int. J. Clin. Pharmacol. Ther. 1997, 35, 401–413. [Google Scholar] [PubMed]
- Ratain, M.J.; Plunkett, W.K., Jr. Principles of Pharmacokinetics. In Holland-Frei Cancer Medicine, 6th ed.; Kufe, D.W., Pollock, R.E., Weichselbaum, R.R., Bast, R.C., Jr., Gansler, T.S., Holland, J.F., Frei, E., III, Eds.; BC Decker: Hamilton, ON, Canada, 2003. Available online: https://www.ncbi.nlm.nih.gov/books/NBK12815/ (accessed on 12 September 2022).
- Pandey, S.; Olston, C. User-Centric Web Crawling. In Proceedings of the 14th International Conference on World Wide Web, WWW’05, Chiba, Japan, 10–14 May 2005; ACM Press: New York, NY, USA, 2005; p. 401. [Google Scholar] [CrossRef]
- Text and Data Mining at Springer Nature. Available online: https://www.springernature.com/gp/researchers/text-and-data-mining (accessed on 12 September 2022).
- Text and Data Mining at MIT|Scholarly Communications—MIT Libraries. Available online: https://libraries.mit.edu/scholarly/publishing/text-and-data-mining-at-mit/ (accessed on 12 September 2022).
- Text and Data Mining. Available online: https://it.lbl.gov/service/library/databases/text-and-data-mining/ (accessed on 12 September 2022).
- Scopus Search API. Available online: https://dev.elsevier.com/documentation/SCOPUSSearchAPI.wadl (accessed on 23 August 2022).
- Bartell, A. Documentation. Crossref. Available online: https://www.crossref.org/documentation/ (accessed on 26 August 2022).
- Springer API. Available online: https://dev.springernature.com/docs (accessed on 23 August 2022).
- arXiv API Access|arXiv e-Print Repository. Available online: https://arxiv.org/help/api/ (accessed on 18 October 2022).
- APIs-Develop-NCBI. Available online: https://www.ncbi.nlm.nih.gov/home/develop/api/ (accessed on 18 October 2022).
- PLOS API|. Available online: https://api.plos.org/ (accessed on 18 October 2022).
- Clarivate Developer Portal—Web of Science API Expanded. Available online: https://developer.clarivate.com/apis/wos (accessed on 18 October 2022).
- bioRxiv API. Available online: https://api.biorxiv.org/ (accessed on 1 November 2022).
- bioRxiv.org—The Preprint Server for Biology. Available online: https://www.biorxiv.org/ (accessed on 1 November 2022).
- medRxiv API. Available online: https://api.medrxiv.org/ (accessed on 1 November 2022).
- Payne, M.A.; Craigmill, A.L.; Riviere, J.E.; Baynes, R.E.; Webb, A.I.; Sundlof, S.F. The Food Animal Residue Avoidance Databank (Farad): Past, Present and Future. Vet. Clin. N. Am. Food Anim. Pract. 1999, 15, 75–88. [Google Scholar] [CrossRef] [PubMed]
- Sidhu, P.K.; Gehring, R.; Mzyk, D.A.; Marmulak, T.; Tell, L.A.; Baynes, R.E.; Vickroy, T.W.; Riviere, J.E. Avoiding Violative Flunixin Meglumine Residues in Cattle and Swine. J. Am. Vet. Med. Assoc. 2017, 250, 182–189. [Google Scholar] [CrossRef]
- Wishart, D.S.; Knox, C.; Guo, A.C.; Shrivastava, S.; Hassanali, M.; Stothard, P.; Chang, Z.; Woolsey, J. DrugBank: A Comprehensive Resource for in Silico Drug Discovery and Exploration. Nucleic Acids Res. 2006, 34 (Suppl. S1), D668–D672. [Google Scholar] [CrossRef]
- Kanehisa, M. Toward Pathway Engineering: A New Database of Genetic and Molecular Pathways. Sci. Technol. Jpn. 1996, 59, 34–38. [Google Scholar]
- Kanehisa, M.; Furumichi, M.; Sato, Y.; Kawashima, M.; Ishiguro-Watanabe, M. KEGG for Taxonomy-Based Analysis of Pathways and Genomes. Nucleic Acids Res. 2022, 51, D587–D592. [Google Scholar] [CrossRef]
- WebDriver API—Selenium Python Bindings 2 Documentation. Available online: https://selenium-python.readthedocs.io/api.html (accessed on 13 September 2022).
- ChromeDriver—WebDriver for Chrome—Getting Started. Available online: https://chromedriver.chromium.org/getting-started (accessed on 1 November 2022).
- WHOCC—ATC/DDD Index. Available online: https://www.whocc.no/atc_ddd_index/?code=J04B&showdescription=no (accessed on 25 August 2022).
- WHOCC—ATCvet Index. Available online: https://www.whocc.no/atcvet/atcvet_index/ (accessed on 31 October 2022).
- 1DATA. Available online: https://1data.life/ (accessed on 24 March 2023).
- Rose, M.E.; Kitchin, J.R. Pybliometrics: Scriptable Bibliometrics Using a Python Interface to Scopus. SoftwareX 2019, 10, 100263. [Google Scholar] [CrossRef]
- Scopus Search Guide. Available online: http://schema.elsevier.com/dtds/document/bkapi/search/SCOPUSSearchTips.htm (accessed on 23 August 2022).
- Paskin, N. Toward Unique Identifiers. Proc. IEEE 1999, 87, 1208–1227. [Google Scholar] [CrossRef]
- Python Release Python 3.10.0. Python.org. Available online: https://www.python.org/downloads/release/python-3100/ (accessed on 23 August 2022).
- What Is an API?—API Beginner’s Guide—AWS. Amazon Web Services, Inc. Available online: https://aws.amazon.com/what-is/api/ (accessed on 23 August 2022).
- What is an Application Programming Interface (API). Available online: https://www.ibm.com/cloud/learn/api (accessed on 23 August 2022).
- Bartell, A. Text and Data Mining for Researchers. Crossref. Available online: https://www.crossref.org/documentation/retrieve-metadata/rest-api/text-and-data-mining-for-researchers/ (accessed on 26 August 2022).
- DOI Registration Agencies. Available online: https://www.doi.org/registration_agencies.html (accessed on 2 November 2022).
- DOI Registration Agencies. Available online: https://www.doi.org/RA_Coverage.html (accessed on 2 November 2022).
- Schedule—Schedule 1.1.0 documentation. Available online: https://schedule.readthedocs.io/en/stable/ (accessed on 24 October 2022).
- Brucker, P. Scheduling Algorithms, 4th ed.; Springer: Berlin/Heidelberg, Germany, 2004; pp. I–XII, 1–367. [Google Scholar] [CrossRef]
- Wu, X.; Deng, M.; Zhang, R.; Zeng, B.; Zhou, S. A task scheduling algorithm based on QoS-driven in cloud computing. Procedia Comput. Sci. 2013, 17, 1162–1169. [Google Scholar] [CrossRef]
- What Are Scopus APIs and How Are These Used? Available online: https://www.elsevier.com/__data/assets/pdf_file/0007/917179/Scopus-User-Community-Germany-API-final.pdf (accessed on 18 October 2022).
- Content Coverage Guide—Elsevier. Available online: https://www.elsevier.com/__data/assets/pdf_file/0007/69451/Scopus_ContentCoverage_Guide_WEB.pdf (accessed on 18 October 2022).
- National Research Council (US) Committee on Drug Use in Food Animals. 1, Drugs Used in Food Animals: Background and Perspectives. In The Use of Drugs in Food Animals: Benefits and Risks; National Academies Press: Washington, DC, USA, 1999. Available online: https://www.ncbi.nlm.nih.gov/books/NBK232562/ (accessed on 18 October 2022).
- The Pandas Development Team. pandas-dev/pandas: Pandas. Zenodo 2020, 21, 1–9. [Google Scholar]
- Millagaha Gedara, N.I.; Xu, X.; DeLong, R.; Aryal, S.; Jaberi-Douraki, M. Global Trends in Cancer Nanotechnology: A Qualitative Scientific Mapping Using Content-Based and Bibliometric Features for Machine Learning Text Classification. Cancers 2021, 13, 4417. [Google Scholar] [CrossRef] [PubMed]
- Text and Data Mining Help—Wiley Online Library. Available online: https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining (accessed on 26 August 2022).
- Liu, Q.; Wang, Y. Determination of Rosamultin in Rat Plasma by LC–MS/MS and Its Application to a Pharmacokinetic Study. Biomed. Chromatogr. 2020, 34, e4728. [Google Scholar] [CrossRef]
- Kapralos, I.; Mainas, E.; Neroutsos, E.; Apostolidi, S.; Siopi, M.; Apostolopoulou, O.; Dimopoulos, G.; Sambatakou, H.; Valsami, G.; Meletiadis, J.; et al. Population pharmacokinetics of micafungin over repeated doses in critically ill patients: A need for a loading dose? J. Pharm. Pharmacol. 2020, 72, 1750–1760. [Google Scholar] [CrossRef]
- Wanmad, W.; Chomcheun, T.; Jongkolpath, O.; Klangkaew, N.; Phaochoosak, N.; Sukkheewan, R.; Laovechprasit, W.; Khidkhan, K.; Giorgi, M.; Poapolathep, A.; et al. Pharmacokinetic characteristics of danofloxacin in green sea (Chelonia mydas) and hawksbill sea (Eretmochelys imbricata) turtles. J. Vet. Pharmacol. Ther. 2022, 45, 402–408. [Google Scholar] [CrossRef]
- Medellín-Garibay, S.E.; Milán-Segovia, R.D.C.; Magaña-Aquino, M.; Portales-Pérez, D.P.; Romano-Moreno, S. Pharmacokinetics of rifampicin in Mexican patients with tuberculosis and healthy volunteers. J. Pharm. Pharmacol. 2014, 66, 1421–1428. [Google Scholar] [CrossRef]
- Hamidi, M. Central nervous system distribution kinetics of indinavir in rats. J. Pharm. Pharmacol. 2007, 59, 1077–1085. [Google Scholar] [CrossRef]
- Future Medicine|Home. Future Medicine. Available online: https://www.futuremedicine.com/ (accessed on 28 October 2022).
- Future Science|Home. Future Science. Available online: https://www.future-science.com/ (accessed on 28 October 2022).
- Dustri Online Services. Available online: https://www.dustri.com/ (accessed on 28 October 2022).
- Welcome to Bentham Science Publisher. Available online: https://www.eurekaselect.com/ (accessed on 28 October 2022).
- Transactions of The Royal Society of Tropical Medicine and Hygiene|Oxford Academic. Available online: https://academic.oup.com/trstmh (accessed on 5 December 2022).
- Pharmacological Reports|All Journal Issues|ScienceDirect.com by Elsevier. Available online: https://www.sciencedirect.com/journal/pharmacological-reports/issues (accessed on 5 December 2022).
- Belič, A.; Karba, R.; Grabnar, I.; Mrhar, A. Data Mining in Drug and Therapy Design. IFAC Proc. Vol. 2002, 35, 211–215. [Google Scholar] [CrossRef]
- Karimi, S.; Wang, C.; Metke-Jimenez, A.; Gaire, R.; Paris, C. Text and Data Mining Techniques in Adverse Drug Reaction Detection. ACM Comput. Surv. 2015, 47, 1–39. [Google Scholar] [CrossRef]
- Hammann, F.; Drewe, J. Data Mining for Potential Adverse Drug–Drug Interactions. Expert Opin. Drug Metab. Toxicol. 2014, 10, 665–671. [Google Scholar] [CrossRef] [PubMed]
- Sun, J.; Sun, F.; Yan, B.; Li, J.; Xin, D. Data Mining and Systematic Pharmacology to Reveal the Mechanisms of Traditional Chinese Medicine in Mycoplasma Pneumoniae Pneumonia Treatment. Biomed. Pharmacother. 2020, 125, 109900. [Google Scholar] [CrossRef] [PubMed]
- Uno, T.; Wada, K.; Hosomi, K.; Matsuda, S.; Ikura, M.M.; Takenaka, H.; Terakawa, N.; Oita, A.; Yokoyama, S.; Kawase, A.; et al. Drug Interactions between Tacrolimus and Clotrimazole Troche: A Data Mining Approach Followed by a Pharmacokinetic Study. Eur J. Clin. Pharmacol. 2020, 76, 117–125. [Google Scholar] [CrossRef] [PubMed]
- Vilar, S.; Friedman, C.; Hripcsak, G. Detection of Drug–Drug Interactions through Data Mining Studies Using Clinical Sources, Scientific Literature and Social Media. Brief. Bioinform. 2018, 19, 863–877. [Google Scholar] [CrossRef] [PubMed]
- Stage, T.B.; Bergmann, T.K.; Kroetz, D.L. Clinical Pharmacokinetics of Paclitaxel Monotherapy: An Updated Literature Review. Clin. Pharmacokinet. 2018, 57, 7–19. [Google Scholar] [CrossRef]
- Hauben, M. Early Postmarketing Drug Safety Surveillance: Data Mining Points to Consider. Ann. Pharmacother. 2004, 38, 1625–1630. [Google Scholar] [CrossRef]
- Xu, X.; Kawakami, J.; Gedara, N.I.M.; Riviere, J.E.; Meyer, E.; Wyckoff, G.J.; Jaberi-Douraki, M. Data Mining Methodology for Response to Hypertension Symptomology—Application to COVID-19-Related Pharmacovigilance. Elife 2021, 10, e70734. [Google Scholar] [CrossRef]
- Xu, X.; Mazloom, R.; Goligerdian, A.; Staley, J.; Amini, M.; Wyckoff, G.J.; Riviere, J.; Jaberi-Douraki, M. Making Sense of Pharmacovigilance and Drug Adverse Event Reporting: Comparative Similarity Association Analysis Using AI Machine Learning Algorithms in Dogs and Cats. Top. Companion Anim. Med. 2019, 37, 100366. [Google Scholar] [CrossRef]
- Jaberi-Douraki, M.; Taghian Dinani, S.; Millagaha Gedara, N.I.; Xu, X.; Richards, E.; Maunsell, F.; Zad, N.; Tell, L.A. Large-Scale Data Mining of Rapid Residue Detection Assay Data From HTML and PDF Documents: Improving Data Access and Visualization for Veterinarians. Front. Vet. Sci. 2021, 8, 674730. [Google Scholar] [CrossRef]
- Zad, N.; Tell, L.A.; Ramachandran, R.A.; Xu, X.; Riviere, J.E.; Baynes, R.; Lin, Z.; Maunsell, F.; Davis, J.; Jaberi-Douraki, M. Development of Machine Learning Algorithms to Estimate Maximum Residue Limits for Veterinary Medicines. Food Chem. Toxicol. 2023. under review. [Google Scholar]
- de Stefano, E.; de Sequeira Santos, M.P.; Balassiano, R. Development of a software for metric studies of transportation engineering journals. Scientometrics 2016, 109, 1579–1591. [Google Scholar] [CrossRef]
- Peter, K.; Christopher, K.; Asura, E. Open knowledge maps: Creating a visual interface to the world’s scientific knowledge based on natural language processing. Z. Bibl. 2016, 4, 98–103. [Google Scholar] [CrossRef]
- Wu, J.; Kim, K.; Giles, C.L. CiteSeerX: 20 years of service to scholarly big data. In Proceedings of the Conference on Artificial Intelligence for Data Discovery and Reuse, Pittsburgh, Pennsylvania, 13–15 May 2019; pp. 1–4. [Google Scholar] [CrossRef]
- Wildgaard, L. A comparison of 17 author-level bibliometric indicators for researchers in Astronomy, Environmental Science, Philosophy and Public Health in Web of Science and Google Scholar. Scientometrics 2015, 104, 873–906. [Google Scholar] [CrossRef]
- Arora, S.K.; Youtie, J.; Shapira, P.; Gao, L.; Ma, T. Entry strategies in an emerging technology: A pilot web-based study of graphene firms. Scientometrics 2013, 95, 1189–1207. [Google Scholar] [CrossRef]
- Björneborn, L.; Ingwersen, P. Perspectives of webometrics. Scientometrics 2001, 50, 65–82. [Google Scholar] [CrossRef]
- Holmberg, K.; Thelwall, M. Local government web sites in Finland: A geographic and webometric analysis. Scientometrics 2008, 79, 157–169. [Google Scholar] [CrossRef]
- Sud, P.; Thelwall, M. Linked title mentions: A new automated link search candidate. Scientometrics 2014, 101, 1831–1849. [Google Scholar] [CrossRef]
- Kumar, R.; Jain, A.; Agrawal, C. Survey of Web Crawling Algorithms. Adv. Vis. Comput. Int. J. 2016, 3, 1–7. [Google Scholar] [CrossRef]
- Shen, S.; Liu, J.; Lin, L.; Huang, Y.; Zhang, L.; Liu, C.; Feng, Y.; Wang, D. SsciBERT: A pre-trained language model for social science texts. Scientometrics 2022, 128, 1241–1263. [Google Scholar] [CrossRef]
- Mary, J.D.P.N.R.; Balasubramanian, S.; Raj, R.S.P. An Enhanced Focused Web Crawler for Biomedical Topics Using Attention Enhanced Siamese Long Short Term Memory Networks. Braz. Arch. Biol. Technol. 2022, 64, e21210163. [Google Scholar] [CrossRef]
- Aronsky, D.; Madani, S.; Carnevale, R.J.; Duda, S.; Feyder, M.T. The Prevalence and Inaccessibility of Internet References in the Biomedical Literature at the Time of Publication. J. Am. Med. Inform. Assoc. 2007, 14, 232–234. [Google Scholar] [CrossRef] [PubMed]
- Wget—GNU Project—Free Software Foundation. Available online: https://www.gnu.org/software/wget/ (accessed on 13 September 2022).
- Pérez-Rodríguez, G.; Pérez-Pérez, M.; Fdez-Riverola, F.; Lourenço, A. Online Visibility of Software-Related Web Sites: The Case of Biomedical Text Mining Tools. Inf. Process. Manag. 2019, 56, 565–583. [Google Scholar] [CrossRef]
- Jsoup: Java HTML Parser, Built for HTML Editing, Cleaning, Scraping, and XSS Safety. Available online: https://jsoup.org/ (accessed on 13 September 2022).
- Xu, S.; Yoon, H.-J.; Tourassi, G. A User-Oriented Web Crawler for Selectively Acquiring Online Content in e-Health Research. Bioinformatics 2014, 30, 104–114. [Google Scholar] [CrossRef] [PubMed]
- Zhang, Y.; Chen, J.; Liu, B.; Yang, Y.; Li, H.; Zheng, X.; Chen, X.; Ren, T.; Xiong, N. COVID-19 Public Opinion and Emotion Monitoring System Based on Time Series Thermal New Word Mining. arXiv 2020, arXiv:2005.11458. [Google Scholar] [CrossRef]
- Mukherjea, S.; Bamba, B.; Kankar, P. Information Retrieval and Knowledge Discovery Utilizing a Biomedical Patent Semantic Web. IEEE Trans. Knowl. Data Eng. 2005, 17, 1099–1110. [Google Scholar] [CrossRef]
- Regular Expression HOWTO—Python 3.10.7 Documentation. Available online: https://docs.python.org/3/howto/regex.html (accessed on 13 September 2022).
- Kaur, G. Usage of Regular Expressions in NLP. Int. J. Res. Eng. Technol. 2014, 3, 7. [Google Scholar]
- Zhang, S.; He, L.; Vucetic, S.; Dragut, E. Regular Expression Guided Entity Mention Mining from Noisy Web Data. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31–November 4 2018; Association for Computational Linguistics: Brussels, Belgium, 2018; pp. 1991–2000. [Google Scholar]
- gFARAD. Available online: https://1data.life/gFARAD/gFARAD.php (accessed on 13 September 2022).
QJ | QD | QP | QH |
---|---|---|---|
QJ01 Antibacterials for Systemic Use; QJ02 Antimycotics for Systemic Use; QJ04 Antimycobacterials; QJ05 Antivirals for Systemic Use; QJ51 Antibacterials for Intramammary Use; QJ54 Antimycobacterials for Intramammary Use | QD01 Antifungals for Dermatological Use; QD02 Emollients and Protectives; QD03 Preparations for Treatment of Wounds and Ulcers; QD04 Antipruritics, Incl. Antihistamines, Anesthetics, Etc.; QD05 Drugs for Keratoseborrheic Disorders (ATC Human: Antipsoriatics); QD06 Antibiotics and Chemotherapeutics for Dermatological Use; QD07 Corticosteroids, Dermatological Preparations; QD08 Antiseptics and Disinfectants; QD09 Medicated Dressings; QD10 Anti-Acne Preparations; QD11 Other Dermatological Preparations; QD51 Products for the Treatment of Claws and Hoofs | QP51 Antiprotozoals; QP52 Anthelmintics; QP53 Ectoparasiticides, Insecticides, and Repellents; QP54 Endectocides | QH01 Pituitary and Hypothalamic Hormones and Analogues; QH02 Corticosteroids for Systemic Use; QH03 Thyroid Therapy; QH04 Pancreatic Hormones; QH05 Calcium Homeostasis |
Parameter | Items in Each Query Sent via Scopus Search in SQ (1) |
---|---|
Drugs | “arestin” OR “aureomycin” OR “bristacycline” OR “chlortetracycline” OR “chlortetracycline AND and AND bisulfate” OR “chlortetracycline AND and AND hydrochloride” OR “clomocycline” OR “declomycin” OR “demeclocycline” OR “demeclocycline AND and AND hydrochloride” OR “demethylchlortetracycline” OR “demethylchlortetracycline AND and AND hydrochloride” OR “doxychel” OR “doxycycline” OR “doxycycline AND and AND calcium” OR “doxycycline AND and AND fosfatex” OR “doxycycline AND and AND hyclate” OR “doxycycline AND and AND hydrate” OR “doxycycline AND and AND hydrochloride” OR “doxycycline AND and AND hydrochloride AND and AND hydrate” OR “dynacin” OR “eravacycline” OR “eravacycline AND and AND dihydrochloride” OR “lymecycline” OR “lymepak” OR “metacycline” OR “methacycline” OR “methacycline AND and AND hydrochloride” OR “minocin” OR “minocycline” OR “minocycline AND and AND hydrochloride” OR “monodox” OR “nuzyra” OR “omadacycline” OR “omadacycline AND and AND tosylate” OR “oracea” OR “oxytetracycline” OR “oxytetracycline AND and AND calcium” OR “oxytetracycline AND and AND dihydrate” OR “oxytetracycline AND and AND hydrochloride” OR “penimepicycline” OR “periostat” OR “rolitetracycline” OR “rolitetracycline AND and AND nitrate” OR “rondomycin” OR “sarecycline” OR “sarecycline AND and AND hydrochloride” OR “seysara” OR “solodyn” OR “sumycin” OR “synterin” OR “terramycin” OR “tetracycline” OR “tetracycline AND and AND hydrochloride” OR “tetracycline AND and AND hydrochloride AND and AND epidihydrocholesterin” OR “tetracycline AND and AND hydrochloride AND and AND hydrocortisone AND and AND acetate” OR “tetracycline AND and AND metaphosphate” OR “tetracycline AND and AND phosphate AND and AND complex” OR “tetracycline AND and AND presteron” OR “tetrax” OR “tigecycline” OR “tygacil” OR “vibramycin” OR “xerava” |
Routes | “i.m” OR “i.m.” OR “im” OR “im” OR “Intra” OR “intraarterial” OR “intra-arterial” OR “intra-articular” OR “intra-articularly” OR “Intramammary” OR “Intramuscular” OR “Intra-muscular” OR “Intranasal” OR “intravenous” OR “nasogastric” OR “ocular” OR “ointment” OR “oral” OR “orally” OR “OTM” OR “p.” OR “p.o” OR “p o.” OR “p.o.” OR “po” OR “s.c” OR “s.c.” OR “sc” OR “sc.” OR “Subcutaneous” OR “Topically” OR “trans” OR “transdermal” OR “trans-dermal” OR “transdermally” OR “transmucosal” OR “dermal” OR “nasal” OR “o” OR “Topical” |
Extracted Data from Scopus Metadata | eid|doi|pii|pubmed_id|title|subtype|subtypeDescription|creator|afid|affilname|affiliation_city|affiliation_country|author_count|author_names|author_ids|author_afids|coverDate|coverDisplayDate|publicationName|issn|source_id|eIssn|aggregationType|volume|issueIdentifier|article_number|pageRange|description|authkeywords|citedby_count|openaccess|freetoread|freetoreadLabel|fund_acr|fund_no|fund_sponsor |
Example of metadata obtained from Scopus | 2-s2.0-85131400463|10.1016/j.ejps.2022.106219|S092809872200104X|35618200|Amikacin pharmacokinetics in elderly patients with severe infections|ar|Article|Medellín-Garibay S.E.|60032541; 60031335; 60025844; 60016574|Hospital Severo Ochoa; Universidad Autonoma de San Luis Potosi; Hospital Universitario Puerta de Hierro Majadahonda; Universidad Complutense de Madrid, Facultad de Farmacia|Leganes; San Luis Potosí; Majadahonda; Madrid|Spain; Mexico; Spain; Spain|8|Medellín-Garibay, Susanna E.; Romano-Aguilar, Melissa; Parada, Alejandro; Suárez, David; Romano-Moreno, Silvia; Barcia, Emilia; Cervero, Miguel; García, Benito|36610623800; 57210937049; 57729583200; 57730179100; 57218390495; 6603720469; 57222997862; 49761285500|60031335; 60031335; 60031335; 60032541-60025844; 60031335; 60016574; 60032541-60025844; 60032541-60025844|8/1/2022|1-Aug-2022|European Journal of Pharmaceutical Sciences|09280987|21331|18790720|Journal|175||106219| |Objective: The aim of this study was to characterize the population pharmacokinetics of amikacin in elderly patients by means of nonlinear mixed effects modelling and to propose initial dosing schemes to optimize therapy based on PK/PD targets. Method: A total of 137 elderly patients from 65 to 94 years receiving intravenous amikacin and routine therapeutic drug monitoring at Hospital Universitario Severo Ochoa were included. Concentration–time data and clinical information were retrospectively collected; initial doses of amikacin ranged from 5.7 to 22.5 mg/kg/day and each patient provided between 1 and 10 samples. Results: Amikacin pharmacokinetics were best described by a two-compartment open model; creatinine clearance (CrCL) was related to drug clearance (2.75 L/h/80 mL/min) and it was augmented 28% when non-steroidal anti-inflammatory drugs were concomitantly administered. Body mass index (BMI) influenced the central volume of distribution (17.4 L/25 kg/m2). Relative absolute prediction error was reduced from 33.2% (base model) to 17.9% (final model) when predictive performance was evaluated with a different group of elderly patients. A nomogram for initial amikacin dosage was developed and evaluated based on stochastic simulations considering final model to achieve PK/PD targets (Cmax/MIC > 10 and AUC/MIC > 75) and to avoid toxic threshold (Cmin < 2.5 mg/L). Conclusion: Initial dosing approach for amikacin was designed for elderly patients based on nonlinear mixed effects modeling to maximize the probability to attain efficacy and safety targets considering individual BMI and CrCL.| Antiinfectives, Clinical pharmacokinetics, Individualized drug therapy, Pharmacometrics, Population pharmacokinetics, Special populations, Therapeutic drug monitoring|0|1|publisherfullgold|Gold||undefined| | |
Query URL | Outcome |
---|---|
https://api.crossref.org/works/10.12998/wjcc.v10.i18.6218/agency (accessed on 25 October 2022) | status: ok message-type: work-agency message-version: 1.0.0 message: {DOI: 10.12998/wjcc.v10.i18.6218} agency: {id: crossref, label: Crossref} |
https://api.crossref.org/works/10.22038/ijp.2017.26942.2320/agency (accessed on 25 October 2022) | status: ok message-type: work-agency message-version: 1.0.0 message: {DOI: 10.22038/ijp.2017.26942.2320} agency: {id: medra, label: mEDRA} |
https://api.crossref.org/works/10.3760/cma.j.issn.2095-4352.2019.11.001/agency (accessed on 25 October 2022) | status: ok message-type: work-agency message-version: 1.0.0 message: {DOI: 10.3760/cma.j.issn.2095-4352.2019.11.001} agency: {id: istic, label: ISTIC} |
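
The agency lookup shown in the table above can be sketched in Python as follows. The URL pattern is taken from the table; the helper names `agency_url` and `parse_agency` are hypothetical, and the nested layout of the sample payload (with `agency` inside `message`) is an assumption reconstructed from the flattened outcome column. The example parses a hand-written sample string and makes no network call.

```python
import json
from urllib.parse import quote


def agency_url(doi):
    """Build the agency lookup URL for a DOI, following the pattern in the table."""
    return f"https://api.crossref.org/works/{quote(doi, safe='/')}/agency"


def parse_agency(payload):
    """Return the registration agency id from a work-agency response body, or None."""
    data = json.loads(payload)
    if data.get("status") != "ok":
        return None
    return data.get("message", {}).get("agency", {}).get("id")


# Hand-written sample shaped like the first row of the table (assumed nesting).
sample = json.dumps({
    "status": "ok",
    "message-type": "work-agency",
    "message-version": "1.0.0",
    "message": {
        "DOI": "10.12998/wjcc.v10.i18.6218",
        "agency": {"id": "crossref", "label": "Crossref"},
    },
})
print(agency_url("10.12998/wjcc.v10.i18.6218"))
print(parse_agency(sample))
```

Knowing the registration agency in advance lets a crawler skip full-text requests to a provider that cannot resolve a DOI registered elsewhere (for example, mEDRA or ISTIC).
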
Web Crawler or API Service Provider | Pharmacokinetics (SQ2): Metadata/Full-Text Records | Toxicology (SQ3): Metadata/Full-Text Records | Neural Network Reduction (SQ4): Metadata/Full-Text Records | Description and Notes |
---|---|---|---|---|
Scopus | 1,228,515/384,177 | 2,517,378/830,701 | 412,765/157,932 | Facilitates source-unbiased metadata retrieval. However, only 384,177 of 1,228,515 records (SQ2), 830,701 of 2,517,378 records (SQ3), and 157,932 of 412,765 records (SQ4) can be retrieved as open-access full-text articles. https://www.scopus.com/search/form.uri?display=advanced (accessed on 23 August 2022). Using article retrieval APIs, open-access full text can be retrieved by DOI (digital object identifier), PII (publisher item identifier), EID (electronic identifier), Scopus ID, and PubMed ID (Medline ID). https://dev.elsevier.com/documentation/FullTextRetrievalAPI.wadl (accessed on 23 August 2022). |
Springer | 121,467 | 653,049 | 185,134 | Using https://link.springer.com/advanced-search (accessed on 23 August 2022), 121,467, 653,049, and 185,134 records are listed for SQ2, SQ3, and SQ4, respectively. Automatic extraction of Springer Nature metadata and open-access full-text articles is possible through the Springer Nature API portal https://dev.springernature.com/ (accessed on 23 August 2022), provided a valid API key is available. |
Crossref | 70,573 | 315,269 | 1,698,727 | Mostly supports itemized search with the title, author, DOI, ORCID ID, etc., for its metadata https://search.crossref.org/ (accessed on 26 August 2022), and Crossref REST API for metadata and full-text access in a more sophisticated way. https://www.crossref.org/documentation/retrieve-metadata/rest-api/text-and-data-mining-for-researchers/ (accessed on 26 August 2022). |
Open Knowledge Maps | 100 most relevant documents | 100 most relevant documents | 100 most relevant documents | A comprehensible visualization tool for bibliometric studies. However, in the given SQ2, SQ3, and SQ4, the outcome was limited to the 100 most relevant documents out of many. https://openknowledgemaps.org/ (accessed on 24 March 2023). |
CiteSeerX | 87,696 | 150,117 | 4,908,661 | A pioneering digital library that provides access to open-access articles under one roof. Metadata extraction using web services was unsuccessful, and full-text article downloads are labor-intensive. https://citeseerx.ist.psu.edu/ (accessed on 24 March 2023). |
Web of Science | 272,550/82,693 | 307,831/44,566 | 19,207/8747 | Web of Science offers a large collection of citation databases while the coverage depends on the institution’s subscription depth. https://www.webofscience.com/wos/woscc/advanced-search (accessed on 18 October 2022). API calls also require a paid subscription for their metadata and full-text content downloads. https://developer.clarivate.com/apis (accessed on 18 October 2022). |
Google Scholar | 1,920,000 | 4,200,000 | 3,920,000 | Relevant articles, both open-access and subscription-based, were listed for the search keywords, but identifying and downloading the relevant full-text articles for the search queries (SQ2, SQ3, and SQ4) appears to be labor-intensive. |
PubMed | 619,276/154,071 | 244,601/80,923 | 5135/2782 | Search queries returned a total of 619,110, 244,491, and 5135 records, with free full-text access to 154,071, 80,923, and 2782 of them, for SQ2, SQ3, and SQ4, respectively. E-Summary, E-Fetch, and OAI-PMH services provide metadata content for a PMCID or PMID. However, full-text downloads are labor-intensive. https://pubmed.ncbi.nlm.nih.gov/advanced/ (accessed on 24 March 2023). |
PubMed Central | 295,140 | 180,896 | 221,477 | PubMed Central offered 295,140, 180,896, and 221,477 free full-text articles for SQ2, SQ3, and SQ4, respectively. https://www.ncbi.nlm.nih.gov/pmc/ (accessed on 24 March 2023). The search results include MeSH terms, but identifying relevant articles is labor-intensive. However, the BioC API provides access to the full-text content of all open-access articles, https://www.ncbi.nlm.nih.gov/research/bionlp/APIs/BioC-PMC/ (accessed on 24 March 2023), and other routes to PMC article datasets include the PMC Cloud Service, PMC OAI-PMH Service, PMC FTP Service, and E-Utilities. https://www.ncbi.nlm.nih.gov/pmc/tools/textmining/ (accessed on 24 March 2023). |
IEEE Xplore | 329 | 2302 | 14,700 | With an institutional subscription, a total of 329, 2302, and 14,700 scientific and technical articles published by IEEE were listed for SQ2, SQ3, and SQ4, respectively. https://ieeexplore.ieee.org/Xplore/home.jsp (accessed on 24 March 2023). In addition, the IEEE metadata API and dynamic query tool permit 200 API calls per day for an account with an institution ID. https://developer.ieee.org/io-docs (accessed on 24 March 2023). |
PLOS | 11,321 | 16,228 | 81,970 | Listed a total of 11,321, 16,228, and 81,970 records for SQ2, SQ3, and SQ4, respectively. https://journals.plos.org/plosone/search (accessed on 18 October 2022). In addition, the Solr API provides access to the PLOS corpus of scientific articles. https://api.plos.org/solr/examples/ (accessed on 18 October 2022). |
WCPK (Proposed Scheme) | 1,228,515/1,167,089 | 2,517,378/2,391,509 | 412,765/392,126 | Compared with the other metadata and article services, the proposed WCPK facilitates automatic access to source-neutral metadata content through the Scopus metadata service. Full-text retrieval reaches approximately 95% of the metadata records through the Scopus, Springer, and Crossref API services, as well as through journal home pages when these API calls are not supported. |
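The full-text shares implied by the paired counts in the table can be checked with simple arithmetic. The following is a minimal sketch, not part of WCPK itself: the counts are copied from the Scopus and WCPK rows with digit grouping normalized, and the names `COUNTS`, `share`, and `coverage_table` are hypothetical helpers introduced only for illustration.

```python
# Paired counts from the table rows: (total metadata records, full-text records).
# Assumption: digit grouping is normalized, e.g. "38,4177" is read as 384,177.
COUNTS = {
    "Scopus (open access)": {
        "SQ2": (1_228_515, 384_177),
        "SQ3": (2_517_378, 830_701),
        "SQ4": (412_765, 157_932),
    },
    "WCPK (full text)": {
        "SQ2": (1_228_515, 1_167_089),
        "SQ3": (2_517_378, 2_391_509),
        "SQ4": (412_765, 392_126),
    },
}


def share(total, retrieved):
    """Return retrieved/total as a percentage rounded to one decimal place."""
    return round(100.0 * retrieved / total, 1)


def coverage_table(counts):
    """Map each source and search query to its full-text share in percent."""
    return {
        source: {query: share(total, retrieved)
                 for query, (total, retrieved) in per_query.items()}
        for source, per_query in counts.items()
    }


if __name__ == "__main__":
    for source, per_query in coverage_table(COUNTS).items():
        print(source, per_query)
```

Under these counts, the open-access share available through Scopus alone is roughly one third of the metadata records for each query, whereas the WCPK row works out to about 95% for all three queries.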
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ampadi Ramachandran, R.; Tell, L.A.; Rai, S.; Millagaha Gedara, N.I.; Xu, X.; Riviere, J.E.; Jaberi-Douraki, M. An Automated Customizable Live Web Crawler for Curation of Comparative Pharmacokinetic Data: An Intelligent Compilation of Research-Based Comprehensive Article Repository. Pharmaceutics 2023, 15, 1384. https://doi.org/10.3390/pharmaceutics15051384