Phylogenetic study of Aliinostoc species (Cyanobacteria) using PC-IGS, nifH and mcy as markers for investigation of horizontal gene transfer

Selection of genes that have not been horizontally transferred is regarded as a challenging task in prokaryote phylogenetic studies. The internal transcribed spacer of the ribosomal genes (16S–23S ITS), microcystin synthetase genes (mcy), nitrogenase (nifH) and the phycocyanin intergenic spacer (PC-IGS) are among the most used markers in cyanobacteria. The region of the ribosomal genes has been considered stable, whereas nifH, mcyG and PC-IGS may have undergone horizontal transfer. To investigate the occurrence of horizontal transfer of nifH, mcyG and PC-IGS, phylogenetic trees of Aliinostoc strains Ay1375 and Me1355 were generated and compared. Phylogenetic trees based on PC-IGS and the 16S–23S ITS were mostly congruent, indicating a common evolutionary history among ribosomal and phycocyanin genes with no evidence for horizontal transfer of PC-IGS. Phylogenetic trees constructed from the nifH and 16S rRNA genes were incongruent. Our results suggest that nifH has been transferred from one cyanobacterium to another. Moreover, the low non-synonymous/synonymous mutation ratio (Ka/Ks) was consistent with an ancient origin of mcyG.


Introduction
The morphological characteristics of cyanobacteria do not always correspond to their taxonomic diversity (Komárek et al. 2016), and therefore the use of molecular markers for phylogenetic studies has become essential (Han et al. 2009).
Aliinostoc species are cosmopolitan, nitrogen (N2)-fixing cyanobacteria found in temperate to tropical freshwater or terrestrial habitats. The widespread proliferation of Aliinostoc species in paddy fields has increased the nitrogen content of soils. Molecular approaches are particularly useful in the detection and identification of specific strains, especially those that are morphologically identical at the species level. Genetic identification can also be used to characterize the degree of genetic similarity among populations (Kabirnataj et al. 2020; Nowruzi et al., 2021; Nowruzi and Shalygin 2021).
One of the genes used to assess genetic differences between Aliinostoc cultures was nifH, a highly conserved gene that encodes dinitrogenase reductase, a protein subunit in the nitrogenase complex involved in N2 fixation. Common to all N2 fixers, the 324-bp nifH fragment is useful in characterizing diazotrophic communities and for differentiating cyanobacterial genera (Foster and Zehr, 2006). The other genetic locus used was cpcBA-IGS, which includes the highly variable intergenic spacer (IGS) region between two phycobilisome subunits (cpcB and cpcA) within the phycocyanin operon (Dyble et al., 2002; Brient et al., 2008). Both cpcA-IGS (Bastien et al., 2011) and nifH appear to be more useful in discriminating between strains than the commonly employed 16S rRNA gene, which exhibits low intrageneric variability in many cyanobacteria (Teneva et al., 2012). Moreover, microcystins, which are cyclic heptapeptide hepatotoxins, are by far the most prevalent of the cyanobacterial toxins and are produced by the microcystin synthetase gene cluster (Jungblut et al. 2006; Nowruzi et al., 2022).
One of the greatest challenges in the selection of markers for phylogenetic studies in cyanobacteria is targeting markers that have not undergone horizontal gene transfer (HGT) (Yerrapragada and Siefert, 2009; Piccin-Santos et al., 2014). HGT and orthologous gene substitutions are relatively common among cyanobacteria and have been important processes in the evolution of this group (Piccin-Santos et al., 2014). However, HGT events in cyanobacteria may still be underestimated, and genes with several functions could have been subjected to this process (Zhaxybayeva et al. 2006). There is no reported evidence that the operons of ribosomal genes have undergone HGT among cyanobacteria. However, the variability observed among the multiple copies of the ribosomal operon found within a single individual can hinder their use in phylogenetic studies (Iteman et al. 2002).
The construction and comparison of phylogenetic trees are perhaps the best ways to assess the contribution of HGT to the evolutionary history of a gene family (Koonin et al., 2002). Incongruence is taken to indicate a role for HGT, whereas congruence is consistent with descent through common ancestry. Therefore, to resolve the relationship between the microcystin synthetase genes, PC-IGS, nifH and 16S rRNA, and the role of HGT in their evolutionary history, we undertook a molecular phylogenetic study. We analyzed two data sets and tested them for congruence: one comprising genes involved in primary metabolism and one comprising genes involved directly in the synthesis of microcystins and nodularins.
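As an illustration of what a congruence test between two marker trees can look like, the following minimal Python sketch computes the Robinson-Foulds distance (the number of bipartitions found in one unrooted tree but not the other). This is not the comparison procedure used in this study, which is not specified beyond visual comparison of topologies; the function names and the two Newick trees (taxa RefA, RefB and RefC) are hypothetical and serve only as an example.

```python
import re

def parse_clades(newick):
    """Return (all leaf names, list of leaf sets under each internal node)."""
    tokens = re.findall(r"[(),;]|[^(),;]+", newick.strip())
    clades = []
    stack = [[]]  # one list of leaf names per currently open parenthesis
    prev = None
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            done = stack.pop()
            clades.append(frozenset(done))
            stack[-1].extend(done)
        elif tok not in ",;":
            # A label directly after ")" is an internal label or support value;
            # anything else is a leaf name, optionally followed by ":length".
            name = tok.split(":")[0].strip()
            if prev != ")" and name:
                stack[-1].append(name)
        prev = tok
    return frozenset(stack[0]), clades

def bipartitions(newick):
    """Non-trivial splits of an unrooted tree, each stored as the side
    that does not contain a fixed reference taxon."""
    leaves, clades = parse_clades(newick)
    ref = min(leaves)
    splits = set()
    for clade in clades:
        side = leaves - clade if ref in clade else clade
        if 1 < len(side) < len(leaves) - 1:
            splits.add(side)
    return leaves, splits

def robinson_foulds(newick_a, newick_b):
    """Count splits present in exactly one of the two trees."""
    leaves_a, splits_a = bipartitions(newick_a)
    leaves_b, splits_b = bipartitions(newick_b)
    if leaves_a != leaves_b:
        raise ValueError("both trees must contain the same taxa")
    return len(splits_a ^ splits_b)

# Hypothetical topologies, for illustration only (not the trees of Figs. 1 to 3).
tree_its = "((Ay1375,Me1355),(RefA,RefB),RefC);"
tree_nifh = "((Ay1375,RefA),(Me1355,RefB),RefC);"
print(robinson_foulds(tree_its, tree_its))
print(robinson_foulds(tree_its, tree_nifh))
```

A distance of zero means the two topologies are identical; larger values indicate more conflicting splits. The distance alone does not show that HGT occurred, since poorly supported branches can also produce incongruence.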
Our goal, using strains of Aliinostoc species as models, was to evaluate the possible occurrence of HGT by comparing phylogenetic trees built with mcy, PC-IGS, nifH and 16S rRNA. This is the first study to compare the different molecular markers in characterizing two Aliinostoc isolates originating from paddy fields of Iran.

Strains and cultivation conditions
The clonal and axenic Aliinostoc strains (strain designations Ay1375 and Me1355) belong to the Cyanobacteria Culture Collection (CCC) and the ALBORZ herbarium. Strains were maintained in climate chambers under controlled conditions of continuous light and temperature (25 ± 5°C) in BG-11 cultivation medium (Rippka et al. 1979) at pH 7.4.

Molecular and sequence analysis
Genomic DNA was isolated from 16-18-day-old log-phase cultures using the Himedia Ultrasensitive Spin Purification Kit (MB505) following the instructions of the manufacturer, except that the incubation times for the lysis solutions AL and C1 were increased to 60 and 20 min, respectively. DNA fragments within the following genes were amplified using the oligonucleotide primers and PCR programs listed in Table 1: 16S rRNA gene, ITS, nifH, PC-IGS, mcyG and mcyD. PCR reactions were performed using a thermal cycler 5.9 and the following procedure: 25 µl aliquots containing 10-20 ng DNA template, 0.5 µM of each primer, 1.5 mM MgCl2, 200 µM dNTPs and 1 U/µl Taq DNA polymerase (Robertson et al., 2001; Dyble et al., 2002; Nowruzi and Lorenzi, 2021). PCR products were analyzed by electrophoresis on 1% agarose gels (SeaPlaque® GTG®, Cambrex Corporation), using standard protocols. The products were purified directly using the Geneclean® Turbo kit (Qbiogene, MP Biomedicals) and sequenced using the BigDye® Terminator v3.1 cycle sequencing kit (Applied Biosystems, Life Technologies).
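The final concentrations given above translate into pipetting volumes through the dilution relation C1 × V1 = C2 × V2. The short Python sketch below shows the arithmetic for a 25 µl reaction. The stock concentrations (10 µM primers, 25 mM MgCl2, 10 mM dNTPs) are assumptions chosen for illustration; they are not reported in this study.

```python
def stock_volume_ul(final_conc, stock_conc, total_volume_ul=25.0):
    """Volume of stock needed so that final_conc is reached in
    total_volume_ul (C1 * V1 = C2 * V2). Both concentrations must
    be given in the same unit."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * total_volume_ul / stock_conc

# Final concentrations are taken from the text; stock concentrations are assumed.
reaction_ul = {
    "forward primer": stock_volume_ul(0.5, 10.0),    # µM target, µM stock
    "reverse primer": stock_volume_ul(0.5, 10.0),    # µM target, µM stock
    "MgCl2": stock_volume_ul(1.5, 25.0),             # mM target, mM stock
    "dNTPs": stock_volume_ul(200.0, 10000.0),        # µM target, 10 mM stock in µM
}

for component, volume in reaction_ul.items():
    print(f"{component}: {volume:.2f} µl")
```

The remaining volume is made up of buffer, template, polymerase and water, whose amounts depend on the reagents actually used.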
The partial sequences were compared with those available in the NCBI database (March 2022) using BLASTn. The BLASTX tool (blast.ncbi.nlm.nih.gov/Blast.cgi) was used for the cpcA-IGS, nifH, mcyD and mcyE genes. The sequences were annotated with the NCBI ORF Finder (https://www.ncbi.nlm.nih.gov/orffinder/) and the ExPASy proteomics tools.

Nucleotide sequence accession numbers
Sequence data were deposited in the DNA Data Bank of Japan (DDBJ) under the accession numbers shown in Table 2.

Phylogenetic analysis
The 16S rRNA, ITS, cpcA-IGS, nifH, mcyG and mcyD gene sequences obtained in this study, as well as the best-hit sequences (> 94% identity) retrieved from GenBank, were first aligned using MUSCLE (Edgar 2004), and then maximum likelihood phylogenetic trees were inferred in IQ-TREE (multicore v1.5.5) (Nguyen et al. 2015). Substitution models were selected for each marker according to the Bayesian information criterion (BIC) using the model test implemented in IQ-TREE (Table 2). Branch support was evaluated with 100 standard bootstrap replicates and 10,000 ultrafast bootstrap replicates (Guajardo-Leiva et al. 2018).
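To make the BIC-based model choice concrete, the sketch below applies the standard definition BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of alignment sites and L the maximized likelihood; the model with the lowest BIC is preferred. The candidate models, log-likelihoods, parameter counts and alignment length are hypothetical values, not results of this study, and the sketch does not reproduce how IQ-TREE itself counts parameters.

```python
import math

def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood

def best_model(candidates, n_sites):
    """Return the name of the candidate with the lowest BIC.
    Each candidate is a tuple (name, log_likelihood, n_params)."""
    return min(candidates, key=lambda c: bic(c[1], c[2], n_sites))[0]

# Hypothetical fits for one marker alignment of 600 sites.
candidates = [
    ("JC", -2100.0, 9),
    ("HKY+G", -2050.0, 14),
    ("GTR+G", -2047.0, 18),
]

for name, lnl, k in candidates:
    print(f"{name}: BIC = {bic(lnl, k, 600):.2f}")
print("selected:", best_model(candidates, 600))
```

In this made-up example the richest model has the best likelihood, but the penalty term k ln(n) outweighs its small gain, so an intermediate model is selected.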

16S-23S rRNA ITS region secondary structure analysis
The sequences corresponding to the D1-D1' helix, D2, D3, Box-B and Box-A regions of the 16S-23S ITS of the studied strains were characterized according to Johansen et al. (2011), and tRNA-Ile and tRNA-Ala were determined using tRNAscan-SE 2.0 (Chan et al., 2021). The ITS secondary structures of the studied strains and the reference strains were generated for comparison using the Mfold web server (version 2.3) (Zuker 2003) under ideal conditions of untangled loop fix, with the temperature set to the default (37 °C).

Sequence divergence
We calculated the number of non-synonymous substitutions per non-synonymous site (Ka) and the number of synonymous substitutions per synonymous site (Ks) using MEGA X (Nowruzi and Blanco, 2019). A Ka/Ks ratio >1 indicates positive selection for advantageous mutations, whereas a Ka/Ks ratio <1 indicates purifying selection to prevent the spread of detrimental mutations (Leikoski et al., 2009).
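The interpretation of the ratio can be sketched as follows. The Python example assumes that the counts of synonymous and non-synonymous differences and sites are already available (for instance from a Nei-Gojobori style count) and applies the Jukes-Cantor correction to each proportion before taking the ratio. Whether MEGA X was run with this estimator is not stated here, and the counts below are hypothetical.

```python
import math

def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def ka_ks(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (Ka, Ks, Ka/Ks) from counts of differences and sites."""
    ka = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ks = jukes_cantor(syn_diffs / syn_sites)
    if ks == 0.0:
        raise ValueError("Ks is zero, so the ratio is undefined")
    return ka, ks, ka / ks

def selection_regime(ratio):
    """Classify a Ka/Ks ratio as described in the text."""
    if ratio > 1.0:
        return "positive selection"
    if ratio < 1.0:
        return "purifying selection"
    return "neutral evolution"  # ratio exactly 1; not discussed in the text

# Hypothetical counts for one pairwise comparison of a coding fragment.
ka, ks, ratio = ka_ks(nonsyn_diffs=6, nonsyn_sites=460, syn_diffs=20, syn_sites=140)
print(f"Ka = {ka:.4f}, Ks = {ks:.4f}, Ka/Ks = {ratio:.3f}: {selection_regime(ratio)}")
```

With these illustrative counts, synonymous changes are far more frequent per site than non-synonymous ones, which gives a ratio well below 1.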

Phylogenetic analyses
Phylogenetic trees based on the different gene markers are shown in Figs. 1 to 3. The nifH gene fragment and the fragment of the phycocyanin operon (cpcA-IGS) were amplified from both studied strains; however, mcyG was detected only in Aliinostoc strain Me1355.
The Aliinostoc phylogenetic trees based on the markers PC-IGS and 16S-23S ITS (Fig. 1) showed similar topologies. In the phylogenetic analysis based on 16S rRNA gene sequences, the studied strains fall within a cluster composed of other Nostoc strains, the closest being Nostoc_elgonense_TAU-MAC_0299 (MN062664). However, the phylogenetic trees obtained using nifH and 16S-23S ITS (Fig. 2) differ in the branch positions of some strains. In the phylogeny based on the 16S-23S ITS, the studied strains were placed in the same cluster as Nostoc_calcicola_Ind32 (N216874). In the phylogeny based on the nifH gene, however, the studied strains fall into separate clades, with Nostoc_sp._NQAIF320 (KJ636979) as the closest strain, indicating that this gene could probably be the best marker for high resolution at the species level.
Box-B was characterized by a terminal bilateral bulge (A) and a bilateral bulge (B). The Box-B helix was not found for Aliinostoc magnakinetifex SA18. The length of the Box-B + spacer varied from 39 nt (Aliinostoc morphoplasticum NOS) to 55 nt (Aliinostoc sp. SA46), with the studied strains showing a length of 44 nt (Fig. 4, Tab. 5).

Sequence divergences
Sequence divergences in the mcyG gene data set were much higher than would be expected under an evolutionary scenario in which recent horizontal gene transfer explains the sporadic distribution of microcystin producers among cyanobacteria. To determine whether the mcyG gene is under positive or negative selection pressure, we compared the number of non-synonymous substitutions per non-synonymous site (Ka) to the number of synonymous substitutions per synonymous site (Ks). The Ka/Ks ratio was well below 1 in pairwise comparisons of representative strains of each genus. A low Ka/Ks ratio is indicative of purifying selection, in which deleterious mutations affecting the protein sequence are selected against, and is consistent with an ancient origin of the mcyG gene.

Discussion
HGT is relatively common among cyanobacteria, but it does not affect all genes in the same way. For some genomes, gene clusters have a lower probability of being transferred (Rantala et al., 2004).
The phylograms based on PC-IGS and mcyG were mostly congruent and no clear HGT signal was found for these genes, indicating a common evolutionary pathway for the phycocyanin, mcyG and ribosomal genes. This result is consistent with those of Sanchis et al. (2005) and Dadheech et al. (2010), who found that PC-IGS and 16S-23S ITS regions of Microcystis and Arthrospira strains also showed a high similarity between marker topologies. Phylogenetic analysis of this region was largely consistent with that obtained from 16S rDNA sequence analysis and revealed a relationship between the 16S rDNA sequence and the phycobilin content of cells.
However, phylogenetic trees constructed from the nifH and 16S rRNA genes were incongruent. Our results suggest that the nifH gene encoding the dinitrogenase reductase has been transferred from one cyanobacterium to another. The phylogenetic incongruence detected is likely to be a result of ancient horizontal transfers of the nifH genes, since the sequence divergence of the dinitrogenase reductase genes was high. The main point of discordance in the nifH phylogenetic tree was the location of the two studied strains: in the phylogeny based on the 16S-23S ITS they were placed in the same cluster, whereas in the phylogeny based on the nifH gene they fall into separate clades. In addition, it is noteworthy that in the proposed 16S-23S ITS phylogeny there is a high Bayesian posterior probability to support their location.
Morphological studies showed that the studied strains were morphologically similar to each other, but they formed an isolated clade in the nifH phylogram, indicating that, despite the morphological similarity, they represent genetically divergent strains. Thus, the hypothesis that the divergence of the strains observed in the nifH tree could have been due to HGT is supported.
Moreover, our analyses do not corroborate the presence of HGT in PC-IGS and mcyG, but this event cannot be neglected as a hypothesis for explaining divergences in phylogenies. A study on the genome of Synechococcus spp. indicated that genes encoding phycocyanin may have evolved independently from genes of the core genome such as the allo-PC gene or the ribosomal regions (Six et al. 2007).
The search for more stable markers, not biased by HGT, has become essential for understanding the phylogeny and taxonomy of cyanobacteria (Gribaldo and Brochier 2009). The results presented herein strongly support nifH as a marker of choice for cyanobacterial phylogenetic studies and emphasize the importance of using multiple molecular markers to prevent erroneous conclusions based on HGT.

Summary
Horizontal gene transfer (HGT), potentially followed by recombination with or replacement of resident homologues, represents an important factor in the phylogeny of prokaryotic organisms such as cyanobacteria, and shapes their evolutionary history. HGT now appears to be a major factor in species delimitation in cyanobacteria and acts as a key selection pressure leading to cyanobacterial diversification. In this study, PC-IGS, nifH, mcyD, mcyG and the ribosomal gene spacer 16S-23S ITS were compared as molecular markers to investigate the occurrence of horizontal transfer. The phylograms based on PC-IGS and mcyG were mostly congruent and no clear HGT signal was found for these genes. However, phylogenetic trees constructed from the nifH and 16S rRNA genes were incongruent. The search for more stable markers, not biased by HGT, has become important for resolving the phylogeny and taxonomy of cyanobacteria.

Four reference sequences were used to search for ITS secondary structure. According to Johansen et al. (2011), nine different areas (D1-D1' helix, D2, D3, tRNA-Ile, tRNA-Ala, Box-B, Box-A and D4) were found in the ITS secondary structure of the studied strains. The D1-D1' and Box-B regions of all studied strains were revealed to be very different in terms of length and shape (Fig. 4, Tab. 4).

Table 5: Comparison of secondary structure of