|
Figure 1. Batch-learning self-organizing maps (BLSOMs) and normalized CG levels. (Ai, Aii) Mononucleotide (Mono) BLSOMs for 100-kb frog sequences. Nodes containing sequences from more than one species are indicated in black, and those containing sequences from only one species are indicated in color to distinguish species; nodes that contain no sequences after machine learning are left blank (white) (Ai). When sequences of a single species occupy more than 50% of a node, the color indicating that species is given (Aii). (Bi, Bii) Dinucleotide (Di) BLSOMs for 100-kb sequences and 1-Mb sequences sliding with a 100-kb step, respectively. Nodes are marked as described in Ai. (C) The contribution level of each dinucleotide to each node is visualized by a color: pink (high), white (moderate), and green (low); results of all dinucleotides are presented in Fig S2. (D) Normalized CG levels of 25 vertebrates are arranged in the descending order: fishes (blue), frogs (reddish brown), reptiles (violet), and mammals (black).
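The two node-marking rules described for panels Ai and Aii can be sketched as follows. This is a minimal illustration, not the authors' software: the function names are hypothetical, and the fallback to black when no species exceeds 50% in the Aii rule is an assumption, since the legend does not state it.

```python
from collections import Counter

def node_color_strict(species_at_node):
    """Panel Ai rule: white if empty, the species if pure, black if mixed."""
    kinds = set(species_at_node)
    if not kinds:
        return "white"
    if len(kinds) == 1:
        return next(iter(kinds))
    return "black"

def node_color_majority(species_at_node):
    """Panel Aii rule: the species that occupies more than 50% of the node.

    Falling back to black when no species exceeds 50% is an assumption.
    """
    if not species_at_node:
        return "white"
    species, count = Counter(species_at_node).most_common(1)[0]
    return species if count / len(species_at_node) > 0.5 else "black"
```

Here each node is represented simply as a list of species labels, one per sequence assigned to that node.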
|
|
Figure S1. The phylogeny and assembled genome sizes of seven frogs used in this study. The phylogenetic relationship of the frogs is derived from the tree on AmphibiaWeb (https://amphibiaweb.org), and the divergence time scale shown at the bottom is based on times estimated using TIMETREE (http://www.timetree.org). The genome size of each frog was estimated in this study (Glandirana rugosa) and in previous studies of other groups: Denton et al (2018) Preprint (Pyxicephalus adspersus), Seidl et al (2019) (Spea multiplicata), Li et al (2019) (Leptobrachium leishanense), Hellsten et al (2010) (Xenopus tropicalis), and Session et al (2016) (Xenopus laevis). The assembled genome size of Rhinella marina was estimated to be in the range of 1.98–2.38 Gb (Edwards et al, 2018), and the average of the range is shown here.
|
|
Figure S2. The contribution level of each dinucleotide to each node is visualized on the dinucleotide batch-learning self-organizing map by a color: pink (high), white (moderate), and green (low) as described in Fig 1C.
|
|
Figure S3. Normalized CG levels of 45 vertebrates are arranged in the descending order: fishes (blue), frogs (reddish brown), reptiles (violet), and mammals (black).
|
|
Figure 2. Distribution of CG composition in 1-Mb windows sliding with a 100-kb step on six chromosomes of four frogs. (A) Pyxicephalus adspersus. (B) Xenopus tropicalis. (C) Xenopus laevis. (D) Leptobrachium leishanense. The binomial approximation is shown as a dashed line. In the graphs, the x-axis shows positions in each chromosome (Mb), and the y-axis shows CG composition (%).
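The sliding-window profile plotted here can be sketched as below. This is an illustrative sketch under stated assumptions: CG composition is taken to be the percentage of dinucleotide positions in a window that are CG, the function name is hypothetical, and handling of ambiguous bases (N) is omitted.

```python
def cg_composition_windows(seq, window=1_000_000, step=100_000):
    """Yield (window start, CG dinucleotide percentage) for each full window."""
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        total = len(chunk) - 1  # number of dinucleotide positions in the window
        # "CG" cannot overlap itself, so str.count gives the exact number
        yield start, 100.0 * chunk.count("CG") / total
```

With the default arguments this corresponds to 1-Mb windows moved in 100-kb steps; smaller values are convenient for checking the function on short test sequences.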
|
|
Figure S4. Distribution of CG compositions in 1-Mb windows sliding with a 100-kb step on chromosomes of four frogs. (A) Pyxicephalus adspersus. To examine the difference between the sex chromosome (chrZ) and autosomes, the binomial approximation is presented as a dashed line. (B) Xenopus tropicalis. (C) Xenopus laevis. (D) Leptobrachium leishanense.
|
|
Figure 3. Distribution of the normalized CG (blue) and the CG/GC ratios (orange) on the Pyxicephalus adspersus chromosomes.
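The two quantities plotted in this figure could be computed along the following lines. The legend does not define the normalized CG level, so this sketch assumes the common observed/expected form, f(CG) / (f(C) × f(G)); the function name is hypothetical.

```python
def normalized_cg_and_ratio(seq):
    """Return (normalized CG, CG/GC ratio) for one sequence or window.

    Assumption: normalized CG = f(CG) / (f(C) * f(G)), observed over expected.
    """
    seq = seq.upper()
    n = len(seq)
    pairs = n - 1  # number of dinucleotide positions
    f_c = seq.count("C") / n
    f_g = seq.count("G") / n
    f_cg = seq.count("CG") / pairs
    f_gc = seq.count("GC") / pairs
    normalized = f_cg / (f_c * f_g) if f_c and f_g else float("nan")
    ratio = f_cg / f_gc if f_gc else float("nan")
    return normalized, ratio
```

A value is returned as NaN when its denominator is zero, which keeps windows without C, G, or GC from raising an error.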
|
|
Figure 4. Trinucleotide (Tri) batch-learning self-organizing maps for 1-Mb sequences sliding with a 100-kb step. (Ai) Nodes are marked as described in Fig 1Ai. (Aii) Sequences of Pyxicephalus adspersus and Xenopus laevis are displayed on the map, and their small satellite territories are marked by arrows. (Aiii) The ratio of occurrence of each of 10 trinucleotides in each satellite of P. adspersus to that in the entire genome is presented by a vertical colored bar: red (Psz1), green (Psz2), and violet (Psz3). Trinucleotides are arranged in the descending order of the ratio in Psz1, Psz2, and Psz3. For the Psz3 satellite, AGG+CCT was included in its top 10 instead of ACC+GGT, but the result of ACC+GGT is presented for comparison with other satellites. (B) The contribution level of each trinucleotide to each node on the map is visualized as described in Fig 1C; results of all trinucleotides are presented in Fig S5.
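The satellite-to-genome ratio in panel Aiii can be illustrated with the sketch below. It assumes that a trinucleotide and its reverse complement are pooled, as labels such as AGG+CCT suggest, and that the ratio is taken between relative frequencies; the function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def merged_key(kmer):
    """Label a trinucleotide together with its reverse complement, e.g. AGG+CCT."""
    rc = kmer.translate(COMPLEMENT)[::-1]
    first, second = sorted((kmer, rc))
    return f"{first}+{second}"

def trinucleotide_freqs(seq):
    """Relative frequencies of strand-merged trinucleotides (windows with N skipped)."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 2):
        kmer = seq[i:i + 3]
        if set(kmer) <= set("ACGT"):
            counts[merged_key(kmer)] += 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}

def enrichment(satellite_seq, genome_seq):
    """Ratio of each trinucleotide's frequency in a satellite to that in the genome."""
    sat = trinucleotide_freqs(satellite_seq)
    gen = trinucleotide_freqs(genome_seq)
    return {k: sat[k] / gen[k] for k in sat if k in gen}
```

Sorting the returned dictionary by value in descending order and keeping the first 10 entries would give a ranking of the kind shown in the panel.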
|
|
Figure S5. The contribution level of each trinucleotide to each node on the trinucleotide batch-learning self-organizing maps is visualized as described in Fig 4B.
|
|
Figure 5. Distribution of CCG+CGG composition in 1-Mb windows sliding with a 100-kb step on the Pyxicephalus adspersus chromosomes. Chromosomal locations of sequences for Psz1, Psz2, and Psz3 are indicated by thick horizontal lines in red, green, and violet on and above the x-axis. In the graphs, the x-axis shows positions in each chromosome (Mb), and the y-axis shows CCG+CGG composition (%).
|
|
Figure 6. Batch-learning self-organizing maps (BLSOMs) for 1-Mb sequences sliding with a 10-kb step. (Ai, Bi, Ci) Di-, tri-, and tetranucleotide (Tetra) BLSOMs, respectively. (Aii, Bii, Cii) All sequences of Glandirana rugosa and sequences derived from sex chromosomes of other frogs are displayed: chrW (dark brown) and chrZ (light brown) of Pyxicephalus adspersus, chr7 of Xenopus tropicalis (light blue), and chr2L of Xenopus laevis (dark blue). Satellite territories of P. adspersus (dark brown) around the G. rugosa territory are marked by arrows. (Aiii, Biii, Ciii) All sequences of G. rugosa and P. adspersus are displayed. The black nodes containing both P. adspersus and G. rugosa sequences are specified by Di1 and Di2 on the di- and trinucleotide BLSOMs (Aiii and Biii, respectively), and sequences belonging to these black nodes were extracted. The G. rugosa nodes in the vicinity of chrW are marked in dark brown (Tet1; Ciii), and sequences belonging to these nodes were extracted.
|
|
Figure 7. Dot plot analyses between chrW in Pyxicephalus adspersus and two scaffold sequences in Glandirana rugosa. (A, B) Each x-axis shows the position on chrW, and the y-axes show positions on (A) scaffold2393955 and (B) scaffold4079606. Blue arrowheads indicate the positions of series of dots at 1.5, 4.5, 17, and 18 Mb on chrW.
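The idea behind a dot plot can be sketched as follows: a dot is placed wherever a short word occurs in both sequences. This is a generic illustration only; the legend does not state which program or parameters were used, and the function name, the word length `k`, and the restriction to forward-strand exact matches are all assumptions.

```python
from collections import defaultdict

def dot_plot_points(x_seq, y_seq, k=12):
    """Return (x position, y position) pairs where an exact k-mer is shared."""
    # Index every k-mer of the x-axis sequence by its start positions
    index = defaultdict(list)
    for i in range(len(x_seq) - k + 1):
        index[x_seq[i:i + k]].append(i)
    # Look up each k-mer of the y-axis sequence in that index
    points = []
    for j in range(len(y_seq) - k + 1):
        for i in index.get(y_seq[j:j + k], ()):
            points.append((i, j))
    return points
```

In such a plot, a word from the y-axis sequence that recurs at several x positions produces a horizontal series of dots.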
|
|
Figure S6. Dot plot analyses between chrW in Pyxicephalus adspersus and two scaffold sequences in Glandirana rugosa. (A, B) Each x-axis shows the position on chrW, and the y-axes show positions on (A) scaffold4871719 and (B) scaffold959090. Blue arrowheads indicate the positions of dots at 1.5, 4.5, 17, and 18 Mb on chrW.
|
|
Figure S7. Distribution of core elements of transcription factor binding sequences in 1-Mb windows sliding with a 100-kb step on chromosomes of Pyxicephalus adspersus. For reference, CG distribution is presented.
|
|
Figure S8. Distribution of core elements of transcription factor binding sequences in 1-Mb windows sliding with a 100-kb step on chromosomes of Xenopus laevis.
|
|
Figure S9. Tetranucleotide (Tetra) batch-learning self-organizing map (BLSOM) for 1-Mb sequences sliding with a 10-kb step. (A) This BLSOM is the same as that presented in Fig 6Ci. (B) If a node of the BLSOM contains sequences derived from large or small chromosomes of Xenopus laevis, it is shown in dark brown or pink, respectively. If a node contains sequences from both chromosomes, it is shown in black. The X. laevis sequences located at the margins of a Xenopus tropicalis territory were mainly derived from the small chromosome, whereas those within the Pyxicephalus adspersus territory were all derived from the large chromosome.
|
|
Figure S10. The occurrence ratios of repetitive sequences in the Glandirana rugosa genome.
|