A new method is paving the way for the study of repeat sequence proteins


Some diseases, such as Huntington’s disease, are caused by proteins with an abnormally high number of repeats of the same amino acid. The structure adopted by these repeats is disordered and inaccessible to current analysis tools. In Montpellier, however, Pau Bernadó’s team has just developed a new method that overcomes this limitation. The study paves the way for a clearer understanding of Huntington’s disease and for studies of other hereditary diseases associated with this type of phenomenon.

Proteins are built from 20 different amino acids. In eukaryotes, approximately 15% of proteins contain sequences made up of repeats of a single amino acid. These repeats give the proteins a certain amount of plasticity but, in a small number of cases, the repeated regions expand. This causes the protein concerned to aggregate and malfunction.

Huntington’s disease, for example, is caused by a protein, huntingtin, carrying an abnormally high number of repeats of the amino acid glutamine. Glutamine is usually repeated around 20 times in this protein, but a defect in DNA replication can cause the number of glutamines to increase from one generation to the next. Huntington’s disease develops when the number of repeats exceeds 35, and the higher the number of glutamines, the earlier the disease develops: an individual whose huntingtin carries 38 glutamine repeats will develop the disease at a very advanced age, whereas in juvenile forms of the disease the protein has 55 or even up to 100 repeats.
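To make the threshold concrete, here is a minimal Python sketch that finds the longest uninterrupted run of glutamines (one-letter code Q) in a protein sequence and compares it with the 35-repeat threshold quoted above. The function names and the toy sequences are assumptions made for illustration: the flanking residues are placeholders, not the real huntingtin sequence, and this is not the analysis used in the study or a diagnostic tool.

```python
import re

# Threshold taken from the article: the disease develops when the
# number of glutamine repeats exceeds 35.
PATHOLOGICAL_THRESHOLD = 35


def longest_glutamine_run(sequence: str) -> int:
    """Return the length of the longest uninterrupted run of Q residues."""
    runs = re.findall(r"Q+", sequence.upper())
    return max((len(run) for run in runs), default=0)


def exceeds_threshold(sequence: str, threshold: int = PATHOLOGICAL_THRESHOLD) -> bool:
    """True if the longest glutamine run is strictly above the threshold."""
    return longest_glutamine_run(sequence) > threshold


# Toy sequences (hypothetical flanking residues, chosen only for illustration).
typical = "MKAF" + "Q" * 20 + "PPP"    # around 20 repeats: the usual case
expanded = "MKAF" + "Q" * 38 + "PPP"   # 38 repeats: above the threshold

print(longest_glutamine_run(typical), exceeds_threshold(typical))
print(longest_glutamine_run(expanded), exceeds_threshold(expanded))
```

The comparison is strict, so a sequence with exactly 35 glutamines is not flagged, matching the wording "exceeds 35".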

A pathological threshold for the number of repeats

“35 repeats appears to be the threshold above which the protein behaves abnormally,” explains Pau Bernadó from the Structural Biochemistry Center* in Montpellier. “Our hypothesis is that beyond this limit, the protein changes structure and its function is altered. However, there is a methodological obstacle preventing us from testing this hypothesis: the tools currently available to us are not capable of studying this type of protein.” This is because the polyglutamine part of the protein does not fold: it remains disordered and is inaccessible to the crystallography methods conventionally used to determine the 3D structure of proteins. Another method, nuclear magnetic resonance (NMR) spectroscopy, does provide access to the structure of unfolded proteins. The structure is deduced from the resonance frequencies of the atoms, each frequency being specific to an atom and varying with its immediate environment. However, this method is of little use for repeated sequences because all the amino acids are identical: all the glutamines appear in a very small region of the spectrum, which makes it difficult to assign each signal to a particular glutamine.


Close-up of the part of the NMR spectrum of the huntingtin protein containing the polyglutamine sequence. The method developed by Pau Bernadó’s team enables the glutamines (denoted Q) to be individually labeled and identified within an otherwise undifferentiated set (in gray). Each peak corresponds to an amino acid.

Pau Bernadó received funding from the European Research Council (ERC) to develop a methodology for studying the structure of normal and pathological huntingtins. It took two years of research before the proof of concept was published in March 2018 in the journal Angewandte Chemie International Edition. He and his team combined two methods (in vitro protein synthesis and nonsense-codon suppressor tRNA) to label each glutamine individually and so identify it in NMR spectra. By repeating the operation for each glutamine in turn, it is possible to obtain the structure of the polyglutamine region and to study how the number of repeated glutamines affects that structure.

A window on the world of repeat sequence proteins

This new methodology opens up numerous opportunities for studying proteins with repeats, which remain poorly documented. Glutamine repeats cause at least ten other neurodegenerative diseases, including Kennedy’s disease and other neuromuscular diseases, with very similar pathological repeats. In addition, many proteins containing amino acid repeats are involved in important biological functions. Gaining a clearer insight into the still opaque world of repeat proteins will help us better understand their behavior and envisage new treatment strategies.


* unit 1054 Inserm/CNRS/Université de Montpellier, Structural Biochemistry Center (CBS), Structure and function of highly flexible proteins team, Montpellier


A. Urbanek et al. A general strategy to access structural information at atomic resolution in polyglutamine homorepeats. Angew Chem Int Ed, online edition of March 7, 2018