Amino acid sequence analysis (アミノ酸配列分析)

Related: amino acid sequence analyses, amino acid sequence analysis, amino acid sequence determination

WordNet

determine the order of constituents in; "They sequenced the human genome"
a following of one thing after another in time; "the doctor saw a sequence of patients" (syn.) chronological sequence, succession, successiveness, chronological succession
film consisting of a succession of related shots that develop a given subject in a movie (syn.) episode
serial arrangement in which things follow in logical order or a recurrent pattern; "the sequence of names was alphabetical"; "he invented a technique to determine the sequence of base pairs in DNA"
several repetitions of a melodic phrase in different keys
arrange in a sequence
an investigation of the component parts of a whole and their relations in making up the whole
the abstract separation of a whole into its constituent parts in order to study the parts and their relations (syn.) analytic thinking
a branch of mathematics involving calculus and the theory of limits; sequences and series and integration and differentiation
a form of literary criticism in which the structure of a piece of writing is analyzed
the use of closed-class words instead of inflections: e.g., "the father of the bride" instead of "the bride's father"
any of a large group of nitrogenous organic compounds that are essential constituents of living cells; consist of polymers of amino acids; essential in the diet of animals for growth and for repair of tissues; can be obtained from meat and eggs and milk and legumes; "a diet high in protein"

PrepTutorEJDIC

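[U][C] succession, continuity (in time, or through a chain of cause and effect) / [C] (a ~) a series (of ...) (+ of + noun) / [U] the order (in which things occur), logical thread / [C] a result (of ...) (+ to + noun)
analysis, breakdown (of content, a situation, etc.); (detailed) examination / analysis (in chemistry and physics); (US) psychoanalysis (in psychology); analysis (in mathematics)
protein
arrangement, connection; adjustment (especially of timing)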

Wikipedia preview

Source (authority): Wikipedia, the free encyclopedia, 2016/02/22 01:17:32 (JST)

wiki en

In bioinformatics, sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of a wide range of analytical methods to understand its features, function, structure, or evolution. Methodologies used include sequence alignment, searches against biological databases, and others.^[1] Since the development of high-throughput methods for producing gene and protein sequences, the rate at which new sequences are added to the databases has increased exponentially. Such a collection of sequences does not, by itself, increase the scientist's understanding of the biology of organisms. However, comparing these new sequences to those with known functions is a key way of understanding the biology of the organism from which a new sequence comes. Thus, sequence analysis can be used to assign function to genes and proteins by studying the similarities between the compared sequences. Many tools and techniques now exist that perform sequence comparisons (sequence alignment) and analyze the resulting alignments to understand their biology.

Sequence analysis in molecular biology includes a very wide range of relevant topics:

The comparison of sequences in order to find similarity, often to infer if they are related (homologous)
Identification of intrinsic features of the sequence such as active sites, post-translational modification sites, gene structures, reading frames, distributions of introns and exons, and regulatory elements
Identification of sequence differences and variations, such as point mutations and single nucleotide polymorphisms (SNPs), in order to obtain genetic markers
Revealing the evolution and genetic diversity of sequences and organisms
Identification of molecular structure from sequence alone

In chemistry, sequence analysis comprises techniques used to determine the sequence of a polymer formed of several monomers. In molecular biology and genetics, the same process is called simply "sequencing".

In marketing, sequence analysis is often used in analytical customer relationship management applications, such as NPTB models (Next Product to Buy).

In sociology, sequence methods are increasingly used to study life-course and career trajectories, patterns of organizational and national development, conversation and interaction structure, and the problem of work/family synchrony. This body of research has given rise to the emerging subfield of social sequence analysis.

History

Since the first sequences of the insulin protein were characterised by Fred Sanger in 1951, biologists have been trying to use this knowledge to understand the function of molecules.^[2]^[3] According to Michael Levitt, sequence analysis was born in the period from 1969 to 1977.^[4] In 1969, the analysis of sequences of transfer RNAs was used to infer residue interactions from correlated changes in the nucleotide sequences, giving rise to a model of the tRNA secondary structure.^[5] In 1970, Saul B. Needleman and Christian D. Wunsch published the first computer algorithm for aligning two sequences.^[6] Over this period, methods for obtaining nucleotide sequences improved greatly, leading to the publication of the first complete genome of a bacteriophage in 1977.^[7]

Sequence alignment

Example multiple sequence alignment

Millions of protein and nucleotide sequences are known. These sequences fall into many groups of related sequences known as protein families or gene families. Relationships between these sequences are usually discovered by aligning them and assigning the alignment a score. There are two main types of sequence alignment: pairwise sequence alignment compares only two sequences at a time, whereas multiple sequence alignment compares many sequences at once. Two important algorithms for aligning pairs of sequences are the Needleman-Wunsch algorithm and the Smith-Waterman algorithm. Popular tools for sequence alignment include:

Pair-wise alignment - BLAST
Multiple alignment - ClustalW, PROBCONS, MUSCLE, MAFFT, and T-Coffee.
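
The global alignment idea behind the Needleman-Wunsch algorithm mentioned above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name needleman_wunsch and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for the example, not parameters given in the text, and production tools use substitution matrices and more elaborate gap models.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b.

    Scoring values are illustrative assumptions, not taken from the article.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

The dynamic-programming table costs time and memory proportional to the product of the two sequence lengths, which is one reason database search tools such as BLAST rely on heuristics rather than exhaustive alignment.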

A common use for pairwise sequence alignment is to take a sequence of interest and compare it to all known sequences in a database to identify homologous sequences. In general, the matches in the database are ordered so that the most closely related sequences appear first, followed by sequences of diminishing similarity. These matches are usually reported with a measure of statistical significance such as an expectation value (E-value).

Profile comparison

In 1987, Michael Gribskov, Andrew McLachlan, and David Eisenberg introduced the method of profile comparison for identifying distant similarities between proteins.^[8] Rather than using a single sequence, profile methods use a multiple sequence alignment to encode a profile which contains information about the conservation level of each residue. These profiles can then be used to search collections of sequences to find sequences that are related. Profiles are also known as Position Specific Scoring Matrices (PSSMs). In 1993, a probabilistic interpretation of profiles was introduced by David Haussler and colleagues using hidden Markov models.^[9]^[10] These models have become known as profile-HMMs.
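
As a rough illustration of how a profile can be derived from a multiple sequence alignment and then used to search a sequence, the following Python sketch builds a simple log-odds scoring matrix. The function names (build_pssm, score_window, best_hit), the uniform background frequencies, and the pseudocount of 1 are all assumptions made for this example; the original profile method and modern tools use more refined weighting schemes. The sketch also assumes the searched sequence is at least as long as the profile and contains only symbols from the given alphabet.

```python
import math
from collections import Counter


def build_pssm(alignment, alphabet, pseudocount=1.0):
    """Build a log-odds profile: one dict (residue -> score) per alignment column."""
    background = 1.0 / len(alphabet)  # assumption: uniform background frequencies
    pssm = []
    for column in zip(*alignment):
        # Count residues in this column; gap characters are ignored.
        counts = Counter(c for c in column if c in alphabet)
        total = sum(counts.values()) + pseudocount * len(alphabet)
        pssm.append({
            r: math.log2(((counts[r] + pseudocount) / total) / background)
            for r in alphabet
        })
    return pssm


def score_window(pssm, window):
    """Sum the per-column scores for a window as long as the profile."""
    return sum(col[r] for col, r in zip(pssm, window))


def best_hit(pssm, seq):
    """Return (score, start) of the best-scoring window in seq."""
    width = len(pssm)
    return max(
        (score_window(pssm, seq[i:i + width]), i)
        for i in range(len(seq) - width + 1)
    )


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACGT", "TCGT"]  # hypothetical aligned family
    profile = build_pssm(toy_alignment, "ACGT")
    print(best_hit(profile, "TTTACGTTT"))
```

Conserved columns produce strongly positive scores for the conserved residue and negative scores for the others, which is how a profile encodes the conservation level of each position.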

More recently, methods have been developed that allow profiles to be compared directly with each other. These are known as profile-profile comparison methods.^[11]

Sequence assembly

Sequence assembly refers to the reconstruction of a DNA sequence by aligning and merging small DNA fragments. It is an integral part of modern DNA sequencing. Since presently available DNA sequencing technologies are ill-suited for reading long sequences, large pieces of DNA (such as genomes) are often sequenced by (1) cutting the DNA into small pieces, (2) reading the small fragments, and (3) reconstituting the original DNA by merging the information from the various fragments.
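
Step (3), merging fragments back into a longer sequence, can be illustrated with a toy greedy overlap merger in Python. This is a deliberately simplified sketch: the names overlap and greedy_assemble and the minimum overlap of 3 are assumptions for the example, reads are assumed to be error-free and on the same strand, and real assemblers use overlap or de Bruijn graphs and must cope with sequencing errors and repeats.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b, or 0 if shorter than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the longest suffix-prefix overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the contigs separate
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


if __name__ == "__main__":
    reads = ["ACGTTAGC", "ATGGCGTA", "GCGTACGT"]  # hypothetical error-free reads
    print(greedy_assemble(reads))
```

The greedy strategy is easy to follow but can be misled by repeated regions, which is one of the main difficulties in assembling real genomes.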

Gene prediction

Gene prediction or gene finding refers to the process of identifying the regions of genomic DNA that encode genes. This includes protein-coding genes as well as RNA genes, but may also include prediction of other functional elements such as regulatory regions. Gene finding is one of the first and most important steps in understanding the genome of a species once it has been sequenced. In general, the prediction of bacterial genes is significantly simpler and more accurate than the prediction of genes in eukaryotic species, which usually have complex intron/exon patterns.
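
The simplest form of gene finding in intron-free DNA is a scan for open reading frames (ORFs). The Python sketch below is a minimal illustration under several assumptions: it looks only at the forward strand, treats ATG as the only start codon, uses the standard stop codons TAA, TAG and TGA, and uses a hypothetical min_codons threshold. Practical gene finders also scan the reverse complement and rely on statistical models rather than this rule alone.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start is inclusive, end is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this reading frame.
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    example = "CCATGAAATTTTAGGG"  # hypothetical toy sequence
    for start, end, frame in find_orfs(example):
        print(frame, example[start:end])
```

Because eukaryotic genes are interrupted by introns, a scan like this is only a rough fit even for bacterial genomes and does not carry over to eukaryotes.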

Protein structure prediction

Target protein structure (3dsm, shown in ribbons), with Calpha backbones (in gray) of 354 predicted models for it submitted in the CASP8 structure-prediction experiment.

The 3D structures of molecules are of great importance to their functions in nature. Since structural prediction of large molecules at an atomic level is a largely intractable problem, some biologists have introduced ways to predict 3D structure at the level of the primary sequence. These include biochemical or statistical analysis of amino acid residues in local regions and structural inference from homologs (or other potentially related proteins) with known 3D structures.

A large number of diverse approaches have been proposed to solve the structure prediction problem. To determine which methods are most effective, a structure prediction competition called CASP (Critical Assessment of Structure Prediction) was founded.^[12]

Methodology

The tasks that lie in the space of sequence analysis are often non-trivial to resolve and require the use of relatively complex approaches. Of the many types of methods used in practice, the most popular include:

DNA patterns
Dynamic programming
Artificial Neural Network
Hidden Markov Model
Support Vector Machine
Clustering
Bayesian Network
Regression Analysis
Sequence mining
Alignment-free sequence analysis

References

[1] Durbin, Richard M.; Eddy, Sean R.; Krogh, Anders; Mitchison, Graeme (1998), Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids (1st ed.), Cambridge: Cambridge University Press, doi:10.2277/0521629713, ISBN 0-521-62971-3
[2] Sanger F, Tuppy H (September 1951). "The amino-acid sequence in the phenylalanyl chain of insulin. I. The identification of lower peptides from partial hydrolysates". Biochem. J. 49 (4): 463–81. PMC 1197535. PMID 14886310.
[3] Sanger F, Tuppy H (September 1951). "The amino-acid sequence in the phenylalanyl chain of insulin. 2. The investigation of peptides from enzymic hydrolysates". Biochem. J. 49 (4): 481–90. PMC 1197536. PMID 14886311.
[4] Levitt M (May 2001). "The birth of computational structural biology". Nature Structural & Molecular Biology 8 (5): 392–3. doi:10.1038/87545. PMID 11323711.
[5] Levitt M (November 1969). "Detailed molecular model for transfer ribonucleic acid". Nature 224 (5221): 759–63. Bibcode:1969Natur.224..759L. doi:10.1038/224759a0. PMID 5361649.
[6] Needleman SB, Wunsch CD (March 1970). "A general method applicable to the search for similarities in the amino acid sequence of two proteins". J. Mol. Biol. 48 (3): 443–53. doi:10.1016/0022-2836(70)90057-4. PMID 5420325.
[7] Sanger F, Air GM, Barrell BG, et al. (February 1977). "Nucleotide sequence of bacteriophage phi X174 DNA". Nature 265 (5596): 687–95. Bibcode:1977Natur.265..687S. doi:10.1038/265687a0. PMID 870828.
[8] Gribskov M, McLachlan AD, Eisenberg D (July 1987). "Profile analysis: detection of distantly related proteins". Proc. Natl. Acad. Sci. U.S.A. 84 (13): 4355–8. Bibcode:1987PNAS...84.4355G. doi:10.1073/pnas.84.13.4355. PMC 305087. PMID 3474607.
[9] Brown M, Hughey R, Krogh A, Mian IS, Sjölander K, Haussler D (1993). "Using Dirichlet mixture priors to derive hidden Markov models for protein families". Proc Int Conf Intell Syst Mol Biol 1: 47–55. PMID 7584370.
[10] Krogh A, Brown M, Mian IS, Sjölander K, Haussler D (February 1994). "Hidden Markov models in computational biology. Applications to protein modeling". J. Mol. Biol. 235 (5): 1501–31. doi:10.1006/jmbi.1994.1104. PMID 8107089.
[11] Ye X, Wang G, Altschul SF (December 2011). "An assessment of substitution scores for protein profile-profile comparison". Bioinformatics 27 (24): 3356–63. doi:10.1093/bioinformatics/btr565. PMC 3232366. PMID 21998158.
[12] Moult J, Hubbard T, Bryant SH, Fidelis K, Pedersen JT (1997). "Critical assessment of methods of protein structure prediction (CASP): round II". Proteins. Suppl 1: 2–6. doi:10.1002/(SICI)1097-0134(1997)1+<2::AID-PROT2>3.0.CO;2-T. PMID 9485489.

UpToDate Contents

1. Protein S deficiency: Clinical manifestations and diagnosis
2. Factor V Leiden and activated protein C resistance: Clinical manifestations and diagnosis
3. Genetics of hypertrophic cardiomyopathy
4. Genetics of Alzheimer disease
5. Spontaneous bacterial peritonitis in adults: Diagnosis

English Journal

Purification and characterization of a novel ubiquitin-like antitumour protein with hemagglutinating and deoxyribonuclease activities from the edible mushroom Ramaria botrytis.

Zhou R, Han YJ, Zhang MH, Zhang KR, Ng TB, Liu F.
AMB Express. 2017 Dec;7(1):47. doi: 10.1186/s13568-017-0346-9. Epub 2017 Feb 22.
A novel ubiquitin-like antitumour protein (RBUP) was isolated from fruiting bodies of the edible mushroom Ramaria botrytis. The protein was isolated with a purification protocol involving ion exchange chromatography on DEAE-Sepharose fast flow and gel filtration on Sephadex G-75. SDS-PAGE, Native-PA
PMID 28229436

Discovery of Klotho peptide antagonists against Wnt3 and Wnt3a target proteins using combination of protein engineering, protein-protein docking, peptide docking and molecular dynamics simulations.

Mirza SB, Ekhteiari Salmas R, Fatmi MQ, Durdagi S.
Journal of Enzyme Inhibition and Medicinal Chemistry (J Enzyme Inhib Med Chem). 2017 Dec;32(1):84-98. doi: 10.1080/14756366.2016.1235569. Epub 2016 Oct 21.
The Klotho is known as lifespan enhancing protein involved in antagonizing the effect of Wnt proteins. Wnt proteins are stem cell regulators, and uninterrupted exposure of Wnt proteins to the cell can cause stem and progenitor cell senescence, which may lead to aging. Keeping in mind the importance
PMID 27766889

Genomic RNA folding mediates assembly of human parechovirus.

Shakeel S, Dykeman EC, White SJ, Ora A, Cockburn JJ, Butcher SJ, Stockley PG, Twarock R.
Nature Communications (Nat Commun). 2017 Dec;8(1):5. doi: 10.1038/s41467-016-0011-z. Epub 2017 Feb 23.
Assembly of the major viral pathogens of the Picornaviridae family is poorly understood. Human parechovirus 1 is an example of such viruses that contains 60 short regions of ordered RNA density making identical contacts with the protein shell. We show here via a combination of RNA-based systematic e
PMID 28232749

Japanese Journal

Identification of a protein glycosylation operon from Campylobacter jejuni JCM 2013 and its heterologous expression in Escherichia coli (MICROBIAL PHYSIOLOGY AND BIOTECHNOLOGY)

Srichaisupakit Akkaraphol,Ohashi Takao,Fujiyama Kazuhito
Journal of bioscience and bioengineering 118(3), 256-262, 2014-09
… In this work, a protein glycosylation (pgl) operon conferring prokaryotic N-glycosylation in C. … jejuni NCTC 11168, with 98% and 99% identities in overall nucleotide sequence and amino acid sequence, respectively. … The pgl operon was heterologously co-expressed with model protein CmeA in the Escherichia coli BL21 ΔwaaL mutant. …
NAID 110009857354

Targeted gene integration using the combination of a sequence-specific DNA-binding protein and phiC31 integrase.

Nakanishi Hideyuki,Higuchi Yuriko,Yamashita Fumiyoshi,Hashida Mitsuru
Journal of biotechnology 186, 139-147, 2014-07-17
… To avoid endogenous gene disruptions, we aimed to enhance the integration site-specificity of the phiC31 integrase-based vector using a sequence-specific DNA-binding protein containing Gal4 and LexA DNA-binding motifs. … The dual DNA-binding protein was designed to tether the UAS-containing donor vector to the target sequence, the LexA operator, and restrict integration to sites close to the LexA operator. …
NAID 120005466731

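Large-scale parallelization of an exome analysis pipeline on the K computer (エクソーム解析パイプラインの京コンピュータ上での大規模並列化)

青山健人,角田将典,松崎由理,石田貴士,秋山泰
IPSJ SIG Technical Report, MPS (Mathematical Modeling and Problem Solving) 2014-MPS-98(33), 1-7, 2014-06-18
In recent years, exome analysis, which examines only the exon regions of the whole genome sequence that are translated into protein, has become feasible and is used in cancer genome research and other fields. In addition, advances in sequencing technology continue to increase the volume of accumulated genomic information, creating demand for even larger-scale environments for bioinformatics analysis. In this study, Genomon-exome, exome analysis pipeline software that runs on general-purpose PC clusters, … RIKEN's supercomputer …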
NAID 110009795511

「amino acid sequence determination」

アミノ酸配列決定 (amino acid sequence determination), アミノ酸配列決定法 (amino acid sequencing method)

Related: amino acid sequence analyses, amino acid sequence analysis, amino acid sequencing, protein sequence analysis

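「アミノ酸配列分析」 (amino acid sequence analysis)

English: amino acid sequence analysis, amino acid sequence analyses, protein sequence analysis
Related: アミノ酸配列解析 (amino acid sequence analysis), アミノ酸配列決定法 (amino acid sequencing method)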

「amino acid sequence analyses」

アミノ酸配列分析, アミノ酸配列解析 (both: amino acid sequence analysis)

Related: amino acid sequence analysis, amino acid sequence determination, protein sequence analysis

「amino acid sequence analysis」

アミノ酸配列分析, アミノ酸配列解析 (both: amino acid sequence analysis)

Related: amino acid sequence analyses, amino acid sequence determination, protein sequence analysis

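「sequence」

n.

配列 (arrangement, sequence), 連続 (succession), 順序 (order), 結果 (result), 筋道 (logical thread), シークエンス (sequence)

v.

配列決定する (to sequence, to determine a sequence)

Related: a sequence of, arrange, arrangement, array, barrage, consecutive, consequence, constellation, continually, continue, continuous, order, ordinal, outcome, output, product, result, resultant, sequencing, sequential, serial, series, thread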

「analysis」

n.

解析 (analysis), 分析 (analysis), 解析法 (analytical method), 分析法 (analytical method)

Related: anal, analyse, analyses, analytical, analyze, assay, dissect, -metry, solve

「sequencing」

n.

配列決定 (sequence determination), 塩基配列決定 (base sequence determination), 塩基配列決定法 (base sequencing method), シークエンシング (sequencing)

Related: sequence
