tripod.nih.gov/tox21/challenge/
colab.research.google.com/drive/1bYK6DPjS69QOIOLfoEMDQK_pRljv0Vji?usp=sharing
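There are 801 "dense features" representing chemical descriptors such as molecular weight, solubility, or surface area, and 272,776 "sparse features" representing chemical substructures (ECFP10, DFS6, DFS8; Matrix Market format).
How were these features obtained?
Let's check these two papers.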
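To get a feel for what a sparse feature file in Matrix Market format looks like, here is a minimal sketch of a reader for the "coordinate" layout, using only the standard library. The toy file contents (3 compounds, 8 substructure features) and the function name `read_matrix_market_coordinate` are made up for illustration and are not the actual Tox21 data. For real files a library reader such as `scipy.io.mmread` is probably the more practical choice.

```python
import io


def read_matrix_market_coordinate(stream):
    """Parse a Matrix Market 'coordinate' file into (n_rows, n_cols, entries).

    entries maps (row, col), with 0-based indices, to a float value.
    Only 'matrix coordinate real|integer general' files are handled here.
    """
    header = stream.readline().strip().split()
    if len(header) != 5 or header[0] != "%%MatrixMarket":
        raise ValueError("not a Matrix Market file")
    obj, fmt, field, symmetry = (token.lower() for token in header[1:])
    if (obj, fmt, symmetry) != ("matrix", "coordinate", "general"):
        raise ValueError("only general coordinate matrices are supported")
    if field not in ("real", "integer"):
        raise ValueError("only real or integer values are supported")

    # Skip comment lines (starting with '%') and blank lines before the size line.
    line = stream.readline()
    while line and (line.startswith("%") or not line.strip()):
        line = stream.readline()
    n_rows, n_cols, n_nonzero = (int(token) for token in line.split())

    entries = {}
    for _ in range(n_nonzero):
        i, j, value = stream.readline().split()
        # Matrix Market indices are 1-based; convert to 0-based.
        entries[(int(i) - 1, int(j) - 1)] = float(value)
    return n_rows, n_cols, entries


# Hypothetical toy file: rows are compounds, columns are substructure features.
TOY_MTX = "\n".join([
    "%%MatrixMarket matrix coordinate integer general",
    "% toy example, not real Tox21 data",
    "3 8 4",
    "1 2 1",
    "1 7 2",
    "2 4 1",
    "3 7 1",
]) + "\n"

n_rows, n_cols, entries = read_matrix_market_coordinate(io.StringIO(TOY_MTX))
density = len(entries) / (n_rows * n_cols)
print(n_rows, n_cols, len(entries))
```

Only the non-zero entries are stored, which is why this format suits a matrix with 272,776 mostly empty columns.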
[Mayr2016] Mayr, A., Klambauer, G., Unterthiner, T., & Hochreiter, S. (2016). DeepTox: Toxicity Prediction using Deep Learning. Frontiers in Environmental Science, 3:80.
[Huang2016] Huang, R., Xia, M., Nguyen, D. T., Zhao, T., Sakamuru, S., Zhao, J., Shahane, S., Rossoshek, A., & Simeonov, A. (2016). Tox21Challenge to build predictive models of nuclear receptor and stress response pathways as mediated by exposure to environmental chemicals and drugs. Frontiers in Environmental Science, 3:85.
Neither paper seems to describe exactly how the features were obtained.
static features --> off-the-shelf software (Cao et al., 2013)
weight, Van der Waals volume, and partial charge information
dynamic features -->
The DeepTox pipeline uses JCompoundMapper (Hinselmann et al., 2011) to create dynamic features.
These methods date from 2013 and 2011 ... aren't they rather outdated? Is there nothing more recent? --> Let's look at other papers.
Cao, D.-S., Xu, Q.-S., Hu, Q.-N., and Liang, Y.-Z. (2013). ChemoPy: freely available python package for computational biology and chemoinformatics. Bioinformatics 29, 1092–1094. doi: 10.1093/bioinformatics/btt105
Hinselmann, G., Rosenbaum, L., Jahn, A., Fechner, N., and Zell, A. (2011). jCompoundMapper: an open source Java library and command-line tool for chemical fingerprints. J. Cheminform. 3:3. doi: 10.1186/1758-2946-3-3
The Tox21 dataset in particular comprised several thousands of static features and hundreds of millions of dynamic features that were sparsely coded.
Where is the Supplementary section?
static features --> ChemoPy: freely available python package for computational biology and chemoinformatics
dynamic features --> jCompoundMapper: An open source Java library and command-line tool for chemical fingerprints
Static features are obtained with ChemoPy,
and dynamic features are obtained with jCompoundMapper.
First, ChemoPy --> implemented in Python 2.
jCompoundMapper -->
How do other papers do it?
github.com/filipsPL/tox21_dataset
www.frontiersin.org/articles/10.3389/fenvs.2015.00077/full
Descriptors Generation
For standardized data sets, two-dimensional molecular descriptors were calculated using KNIME nodes: RDKit (http://rdkit.org/, 117 descriptors), CDK (Beisken et al., 2013; http://sourceforge.net/projects/cdk/, 97 descriptors) and fingerprints [PubChem (881 bits) and MACCS (167 bits)], giving 1262 descriptors for each compound. For the list of used descriptors and literature references see Supplementary Table S5. For each target, Arff weka file was created using KNIME Arff Writer node.
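The descriptor count quoted above can be checked with a small sketch: 117 + 97 + 881 + 167 = 1262. The code below concatenates four blocks of those sizes into one feature vector and renders it as minimal ARFF text. It assumes placeholder values (all zeros) and hypothetical names (`toy_target`, `f0`, `f1`, ...); the real values would come from the KNIME nodes described in the paper, and this is not the authors' actual workflow.

```python
import io

# Block sizes quoted in the passage above.
BLOCK_SIZES = {"rdkit": 117, "cdk": 97, "pubchem_fp": 881, "maccs_fp": 167}


def concatenate_blocks(blocks):
    """Join per-tool descriptor lists into one flat vector, in BLOCK_SIZES order."""
    vector = []
    for name, size in BLOCK_SIZES.items():
        values = blocks[name]
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        vector.extend(values)
    return vector


def to_arff(relation, rows, labels):
    """Render rows as minimal ARFF text: numeric attributes plus a {0,1} class."""
    out = io.StringIO()
    out.write(f"@relation {relation}\n\n")
    for k in range(len(rows[0])):
        out.write(f"@attribute f{k} numeric\n")  # hypothetical attribute names
    out.write("@attribute class {0,1}\n\n@data\n")
    for row, label in zip(rows, labels):
        out.write(",".join(str(v) for v in row) + f",{label}\n")
    return out.getvalue()


# Placeholder values only; no real descriptors are computed here.
dummy_blocks = {name: [0] * size for name, size in BLOCK_SIZES.items()}
vector = concatenate_blocks(dummy_blocks)
arff_text = to_arff("toy_target", [vector], [1])
print(len(vector))  # prints 1262
```

One ARFF file per target, as in the paper, would then just mean calling `to_arff` once per assay with that assay's labels.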