Discovery of molecular markers and development of Database for Ricebean
Date
2024
Publisher
DIVISION OF AGRICULTURAL BIOINFORMATICS, ICAR-INDIAN AGRICULTURAL STATISTICS RESEARCH INSTITUTE; GRADUATE SCHOOL, INDIAN AGRICULTURAL RESEARCH INSTITUTE, NEW DELHI – 110012
Abstract
Ricebean (Vigna umbellata) is an annual legume cultivated during the Kharif season. Its seeds are
harvested and consumed as a pulse, contributing to the nutritional needs of communities.
Although Ricebeans are a valuable crop, they are considered minor legumes due to their limited
cultivation areas. It is commonly grown as an intercrop with maize and sorghum, maximizing
land use and enhancing agricultural diversity. The cultivation of Ricebeans is prominent in the
Northern part of India, mainly in Uttarakhand, as well as in the North-eastern part of India,
particularly in Assam. One of the significant benefits of Ricebeans lies in their seed composition.
The seeds contain a good amount of protein and other essential nutrients, making them a
valuable source of sustenance for people. The protein content in Ricebean seeds supports dietary
requirements, especially in regions where protein-rich food sources are vital for nutritional well-being.
Despite being categorized as a minor legume, Ricebeans play an essential role in
enhancing crop diversity, providing nutritious food, and supporting sustainable agricultural
practices in specific regions of India. Their ability to thrive in certain geographical locations
makes them a valuable crop for local communities. In this research, mining 511010 transcripts
from the Ricebean seed and leaf transcriptome yielded a total of 61905 SSRs, 79359 primers for
those sequences, 172749 amplicons, and 79358 loci. Dinucleotide repeats were predominant
(76.5%), followed by trinucleotides (20.8%) and tetranucleotides (1.7%); other types totaled 1%.
Among the dinucleotide motifs, AT accounted for 17.5%, followed by TA (16.9%), TC (9%), and
others (33.1%). The grouped motif AT/TA accounted for 17.5% of the total, followed by TA/AT
(16.9%), and all others accounted for 65.6%. SSRs 10 bp long accounted for 33.1%, followed by
those 15 bp long (12.3%), and all other lengths accounted for 54.6%. A total of 196222 SNPs
were identified, with a highest quality score of 999. A total of 17260 unique markers were
mapped to the input sequences, 13285 sequences had markers mapped, and mapped markers
averaged nearly 2.38 amplicons each. Out of 79358
loci, 66208 (83.43%) had primer pairs designed and 13150 (16.57%) did not. There are a total
of 196221 SNPs present in the Ricebean leaf and seed transcriptome. Here we present the first
whole genome-based molecular marker database for Ricebean, RbMmDb (Ricebean Molecular
Marker Database), covering this research as well as existing data on the internet. RbMmDb
presents SSRs, with their primers and amplicons, mined in silico using the GMATA tool, and
SNPs mined in silico using BCFtools and SAMtools. RbMmDb, a user-friendly and freely
accessible tool, offers motif-wise SSRs as well as location-wise SNPs. It is an online relational
database based on a "three-tier architecture" that catalogs information about molecular markers
in MySQL and has a user-friendly interface developed using PHP (Hypertext Preprocessor).
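
To illustrate what in silico SSR mining involves, the following is a minimal Python sketch that finds perfect microsatellite repeats with a regular expression and tallies them by motif length, which is how class proportions such as the dinucleotide and trinucleotide shares above can be derived. It is not the GMATA algorithm used in this work. The function names, the minimum-repeat thresholds, and the example sequence are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical thresholds (motif length -> minimum number of repeats).
# Illustrative only; these are not the settings used in the thesis.
MIN_REPEATS = {2: 5, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (homopolymer) or "ATAT" (a dinucleotide in disguise).
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


def class_proportions(hits):
    """Percentage of SSRs in each motif-length class (2 = dinucleotide, ...)."""
    counts = Counter(len(motif) for _, motif, _ in hits)
    total = sum(counts.values())
    return {k: 100.0 * c / total for k, c in counts.items()} if total else {}


# Toy sequence (made up): one (AT)6 and one (GAA)5 repeat.
example = "GGC" + "AT" * 6 + "CCG" + "GAA" * 5 + "T"
hits = find_ssrs(example)
print(hits)
print(class_proportions(hits))
```

A dedicated tool additionally handles compound and imperfect repeats, primer design, and amplicon mapping, none of which this sketch attempts.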