The Arabidopsis community has always been very open, so today researchers and funding bodies can look back on more than 20 years of strong international collaboration and data sharing. The community’s efforts have always been guided by decadal plans, which led to the establishment of many Arabidopsis community projects and resources:
- The Arabidopsis genome research project (1990-2001) led to the completion of the Arabidopsis genome. During this decade, two of the three stock and resource centers, ABRC (Arabidopsis Biological Resource Center, US) and NASC (Nottingham Arabidopsis Stock Center, UK), were founded.
- The Multinational Coordinated Arabidopsis thaliana Functional Genomics Project (2002-2011) led to the functional annotation of most of the Arabidopsis thaliana genes. Alongside, The Arabidopsis Information Resource (TAIR) was founded in 2001 to meet the needs of the growing Arabidopsis research community.
- From Bench to Bountiful Harvests (2012-2021) aims to obtain in-depth knowledge of how the genome is translated into a continuum of processes, from the single molecule to cells and tissues, the whole plant, plant populations, and fields of plants, to be able to build a predictive model of an Arabidopsis plant. In order to provide a flexible platform to enable open sharing of the vast amount of data generated by today’s omics approaches, the International Arabidopsis Informatics Consortium (IAIC) founded the Arabidopsis Information Portal (Araport) in 2013.
The directors of Arabidopsis community projects and resources have been contributing to the MASC reports for several years, by presenting their respective goals, progress and news. Since 2014, general plant projects and resources have also been included, reflecting the growing connections between researchers focussing on different plant species.
The most recent 2023-24 Project Reports can be downloaded here. If you require individual project reports then please contact the MASC coordinator.
Resource and Stock Centers
Arabidopsis Biological Resource Center (ABRC)
David Somers, ABRC Director
Emma Knee, ABRC Associate Director
August 6th 2020
Although demand for ABRC resources is still strong, there has been a downward trend in orders over the past 10 years. In the past 5 years, orders have stabilized, indicating that there is still a base level of demand for these types of resources. Seeds and clones of Arabidopsis make up the bulk of stock distribution, with the proportion of seed orders increasing from 84% in 2009 to 90% of total orders in 2019.
Resource distribution of other Brassicaceae is less than 5% of the total. In 2019, almost 50% of stock distribution was to US researchers, up from 40% five years ago. This indicates that demand for Arabidopsis seed resources in the US is robust, suggesting a strong Arabidopsis research community.
Recent activities and newly developed tools and resources
ABRC launched a new web site (https://abrc.osu.edu/) in May 2019. The site has separate pathways to information for researchers and educators including stock data, ordering, payment and donation information as well as general information about Arabidopsis and ABRC. We continue to collaborate with The Arabidopsis Information Resource (TAIR) to make stock information available through the TAIR web site (https://www.arabidopsis.org/) with links out to ABRC.
Direct links have also been established from the SALK Institute Genomic Analysis Laboratory (http://signal.salk.edu/) T-DNA Express tool, 1001 Genomes (https://1001genomes.org/) web site, and the European Arabidopsis Stock Center, NASC (http://arabidopsis.info/), web site allowing users to leverage search functionality available through these resources and link directly to ABRC for stock ordering.
In 2019 ABRC distributed close to 190,000 samples to 2,260 individuals located in 47 countries. We also provided bulk seed for 2,300 seed lines to NASC and an additional 4,200 samples of seed for distribution to NASC users, where we were not able to provide bulk seed. The seed collection is now composed of more than 536,000 stocks.
Most of the A. thaliana seed stocks donated in 2019 were characterized mutant lines. These include a collection of embryo defective mutants donated by D. Meinke (Oklahoma State), added to a collection of chloroplast mutants recently received from R. Last (U. of Michigan). Both are part of our ongoing drive for new “legacy” stocks from retiring or re-orienting researchers for which we were recently funded (NSF CSBR). Diverse members of the Brassicaceae including accessions of Arabis alpina, Thlaspi arvense, Brassica rapa, Erysimum cheiranthoides and Caulanthus amplexicaulis were also received.
The non-seed portion of our collection now numbers more than 460,000 stocks. These lines include individual clones and libraries from Arabidopsis thaliana, and other members of the Brassicaceae, as well as constructs, host strains, antibodies, cell lines and education resources.
Non-seed resources added to the collection in 2019 include the JAtY BAC library generated by I. Bancroft, transcription factor ORF clones in bait and prey vectors donated by J. Ecker, maize clones donated by E. Grotewold and constructs from various donors. ABRC has performed quality control testing for 4,395 new and existing stocks, involving either germination testing or verifying stock identity.
Planned future activities of your project or resource
ABRC will continue to solicit donations of Arabidopsis seed resources and to expand the stock collection to include new resources for Arabidopsis and other Brassicaceae. Distribution is expected to continue at the current levels for most resources. Demand for education resources will likely increase following trends of the past 5 years. Quality control testing of new donations and stocks reproduced at ABRC will be carried out at similar levels to 2019.
ABRC and NASC collaboration via exchange of seed stock resources and related data will also continue. Ongoing development of our web site will include improvements to the user experience and administrative functions as well as addition of an application programming interface (API) to allow easy access to stock data. ABRC outreach continues to work with local community partners, the Ohio State University (OSU), and the broader plant science research and teaching communities to support education initiatives, especially those utilizing Arabidopsis resources.
Conferences, Workshops and Training events
At ICAR 2019 in Wuhan, China, ABRC organized a booth in conjunction with NASC, and Dave Somers gave a presentation on the new web site in the Bioinformatics for plant research workshop. ABRC also participated in a data resources booth at Plant Biology 2019 in San Jose, CA.
Emma Knee gave a general presentation on ABRC as part of the US Culture Collection Network virtual meeting in November 2019. The ABRC advisory committee meeting was held at OSU, with a full day of presentations by ABRC staff members and a tour of the facilities. ABRC outreach and education ran booths at two local events, the Science Education Council of Ohio’s annual conference and the National Science Teaching Association’s regional conference in Cincinnati, OH.
In 2020 ABRC and NASC will organize a booth together at the 31st ICAR in Seattle, WA and will again participate in the data resources booth at Plant Biology 2020 in Washington, DC [ICAR2020 now delayed until 2021 and Plant Biology is going virtual: Ed]. ABRC outreach will participate in the Science Education Council of Ohio’s annual conference and The Advancing Research Impact in Society Broader Impacts summit in Durham, NC.
Nottingham Arabidopsis Stock Center (uNASC)
Sean May, Director
Marcos Castellanos-Uribe, Operations Manager
August 6th 2020
In 2019 we sent over 100,000 tubes of seed worldwide to 43 countries. This year’s top receiving countries [previous 2018 position] are: 1st - China [3], 2nd - Germany [2], 3rd - Spain [6], 4th - Japan [4], 5th - France [5], and 6th - the UK [1]. The biggest donor of stocks by far is still Germany.
The COVID-19 event saw a decline in the numbers of stocks ordered from NASC during 2020. This was particularly noticeable with respect to China during February and then some European labs from March as individual groups and departments ramped down their work. That said, orders from some countries such as Germany did not noticeably slow, and many institutes worldwide are still open as of April 2020.
Most noteworthy was the dramatic resurgence in orders (>2,000 stocks) from China during the first week of April 2020. Our exhibition stand in Wuhan at ICAR2019 was extremely well attended and demonstrated the strength and expansion of Arabidopsis research in Asia, particularly in China itself.
For up-to-date details on stock donations or anything else that you wish to know, please do visit the NASC site, or contact us at any time. If we (NASC and ABRC) can help you or promote your research to the community by distributing seed on your behalf, then please do contact us - don’t wait for us to come to you.
See you at ICAR2021 in Seattle.
RIKEN BioResource Center (BRC)
Masatomo Kobayashi (RIKEN coordinator)
August 6th 2020
Arabidopsis has been a significant plant species for plant science in Japan. The word “Arabidopsis” appears in 74 titles out of 563 presentations at the 83rd Annual Meeting of the Botanical Society of Japan (BSJ, Sep. 2019, Sendai). This model plant has been getting attention due to a novel that was serially published in a national newspaper. Shion Miura, a popular female writer, depicts the laboratory life of a young heroine who studies Arabidopsis mutants. Many people have learned from it how a plant science laboratory works. The novel won the BSJ Special Prize.
Recent activities and newly developed tools and resources of your project or resource.
RIKEN BRC has joined the project “RIKEN Integrated Symbiology (iSYM)” and started research on plant-microbe symbiosis. Model plants such as Arabidopsis and Brachypodium are used in the project.
https://www.riken.jp/en/research/labs/isym/
https://www.yokohama.riken.jp/isym/index.html
Planned future activities of your project or resource
A series of binary vectors deposited by Dr. Tsuyoshi Nakagawa, Shimane University, will be added to our Exp-Plant Catalog soon.
http://shimane-u.org/nakagawa/gbv.htm
We are planning to add Arabidopsis Transcription Factor – Glucocorticoid Receptor (TF-GR) lines to our catalog. The lines were developed in Dr. Minami Matsui’s laboratory at RIKEN CSRS.
https://www.embopress.org/doi/full/10.15252/msb.20177840
Conferences, Workshops and Training events
RIKEN BRC will host the 13th Asian Network of Research Resource Centers (ANRRC) international meeting in November 2020.
Arabidopsis Informatics and Data Sharing Resources
International Arabidopsis Informatics Consortium (IAIC)
By Blake C. Meyers (Interim Director) and Joanna D. Friesner (Assistant), http://www.arabidopsisinformatics.org/.
August 6th 2020
Recent activities and newly developed tools and resources.
The Arabidopsis community, and other scientific communities that use Arabidopsis resources in their work, rely on the publicly shared community resources developed over the past several decades. Valuable resources accessed by researchers in both the public and private sectors include reference genomic sequence data and newer resources. Many of these have been generated by the community, and at an increasing rate, as technological advances and subsequent reductions in the cost of generating data sets have led to a rapid increase in the number and type of data sets in the public sphere. In 2010, the IAIC was formed by the North American Arabidopsis Steering Committee (NAASC) in response to the announcement of the planned termination of federal funding for The Arabidopsis Information Resource (TAIR); TAIR had been the primary publicly accessible online Arabidopsis database since its inception in 1999 and had received continuous funding from the US National Science Foundation since its founding. The international Arabidopsis community, represented by NAASC and the Multinational Arabidopsis Steering Committee (MASC), convened workshops to strategize how best to continue the vital services that TAIR had provided, and to ensure continuity and availability of community-generated data and resources (1).
IAIC’s initial focus was to promote the collaborative development of a new bioinformatics resource, later named ‘Araport’, which was conceived through a ‘Design Workshop’ in 2011 (2). The intent was that Araport would serve as the underlying infrastructure for Arabidopsis informatics resources by interacting and linking with resources developed and housed by others, e.g. by linking with data sets generated in individual laboratories located around the world. A key component envisioned for Araport’s success was that community-generated resources, tools, and data sets would be linked dynamically to Araport such that the global community could provide, support, update, and access the shared resources. This democratization of workload, expertise, innovation, and financial commitment was intended to enable Araport’s sustainability and promote creativity and interaction amongst groups that generate and use tools and datasets. Concurrent with Araport’s design and development, TAIR became sustainable via a not-for-profit organization, Phoenix Bioinformatics, which allowed the database to continue while TAIR staff refocused on annotation and improvements to the database, all funded through a subscription service. TAIR and Araport thus co-existed in a complementary manner, the former emphasizing functional annotation and the latter aggregating resources.
An IAIC workshop entitled “2018 - the Future of Arabidopsis Bioinformatics”, was held in May, 2018 to evaluate the status of Arabidopsis informatics and chart a course for future research and development. In advance of the meeting, organizers solicited input from the broader community via MASC, who distributed an online survey of plant bioinformatic needs (3).
The workshop focused on several challenges, including the need for reliable and current annotation, community-defined common standards for data and metadata, and accessible and user-friendly repositories/tools/methods for data integration and visualization. Solutions envisioned included (a) a centralized annotation authority to coalesce annotation from new groups, establish a consistent naming scheme, distribute this format regularly and frequently, and encourage and enforce its adoption; (b) community-established guidelines and standards for data and metadata formats; and (c) a searchable, central repository for analysis and visualization tools, with improved versioning and user access to make tools more accessible.
Finally, workshop participants proposed a “one-stop shop” website, an Arabidopsis “Super-Portal” to link tools, data resources, programmatic standards, and best practice descriptions for each data type, while emphasizing such a portal must have community buy-in and participation in its establishment and development to encourage adoption. The 2018 IAIC workshop participants produced a white paper outlining the current state, challenges, and priorities for the future of Arabidopsis bioinformatics resources (4). Most recently, after several unsuccessful NSF grant renewal applications, funding of the Araport project has been discontinued (see below).
(1) https://doi.org/10.1105/tpc.110.078519
(2) https://doi.org/10.1105/tpc.112.100669
(3) http://arabidopsisresearch.org/images/publications/documents_articles/2018_MASC_BioinfoSurvey.pdf
(4) https://doi.org/10.1002/pld3.109
Planned future activities.
The IAIC’s funding has nearly expired and thus its associated activities are winding down. IAIC’s major focus was on enabling community development of Araport to replace and augment TAIR. Araport.org was established by PI Chris Town and colleagues and had been funded by NSF since its inception. However, after several recent unsuccessful NSF grant renewal applications, funding of the project has been discontinued. Teams from The Arabidopsis Information Resource (TAIR), the National Center for Genome Resources (NCGR), and the Bio-Analytic Resource for Plant Biology (BAR) have taken over its operation and have refreshed/expanded the functionalities that were available at Araport.
Conferences, Workshops and Training events
The IAIC held a community workshop on January 13, 2020, entitled “Arabidopsis Bioinformatics” at the Plant and Animal Genomes (PAG) XXVIII conference in San Diego. The workshop featured speakers from all over the world, starting with Yijing Zhang, from the Shanghai Institutes for Biological Sciences, who shared her group’s work in assembling the Plant Regulomics resource, a data-driven interface for retrieving upstream regulators from plant multi-omics data.
Korbinian Schneeberger from MPIPZ in Germany talked about their chromosome-level assemblies of seven Arabidopsis thaliana ecotype genomes and what the analysis of the data revealed about the evolution and total gene complement of this species. Andrew Farmer from the National Center for Genome Resources in New Mexico spoke about the deployment of a Genome Context Viewer (GCV) displaying the aforementioned Jiao/Schneeberger genomes, additional Arabidopsis ecotypes, and related Brassicaceae genomes, and the usefulness of GCV in exploring both macro- and micro-syntenic regions.
Larry Wu from Michigan State University described the identification of upstream, overlapping, and upstream overlapping ORFs (uORFs, oORFs, uoORFs) in Arabidopsis using the RiboSeq technique. The last two speakers, Sylva Donaldson from the BAR at the University of Toronto and Shabari Subramaniam from TAIR at Phoenix Bioinformatics in California talked about the transition of both Thalemine and JBrowse from the old Araport site to being hosted by BAR and TAIR, respectively, using the latest versions of the software.
Both tools are available to the community with updated and newly incorporated datasets. In their two talks, Sylva and Shabari also shared additional developments and new features at BAR and TAIR.
IAIC expected to hold a final workshop at ICAR 2020, scheduled for this July at the University of Washington, Seattle. However, due to the unexpected novel COVID-19 outbreak, ICAR 2020 has been postponed to June 2021. It is unclear if the final IAIC workshop will take place during ICAR 2021.
Additional Information
The IAIC, and this material, are based upon work supported by the National Science Foundation under award #1062348. Any opinions, findings, and conclusions or recommendations expressed in this event, or in resulting work, are those of the participants and do not necessarily reflect the views of the National Science Foundation.
The Arabidopsis Information Portal (Araport)
By Chris Town (Principal Investigator), www.araport.org.
The Araport team extended its fully functional web portal by adding many data types to its ThaleMine data mining tool and many tracks to its JBrowse browser. The team delivered web site infrastructure that, even in its prototype stage, allows community participants to develop and deploy their own web services and data integration applications.
We have also completed the most up-to-date and complete re-annotation of the Col-0 genome to produce Araport11, which consists of 37,523 genes (27,688 protein-coding, 5,051 non-coding, 952 pseudogenic, and 3,901 transposable element-related loci) and 58,149 transcripts. The annotation contains 738 new protein-coding loci and a further 508 novel transcribed loci. In addition, we retired 388 genes that encoded short (hypothetical) proteins for which there was no database or RNA-seq support. Araport11 is available on the Araport project site (http://www.araport.org) and will also be released in GenBank by the time this report is published.
JBrowse and ThaleMine continue to be central features of the portal’s user interface
JBrowse now hosts over 100 data tracks, including the latest gene models from Araport11 and their supporting evidence, as well as many community sourced tracks including 1001 genomes SNP data. Methods to allow community members to post and share their data through JBrowse using either GitHub or the CyVerse data store are in active development.
ThaleMine is a data warehouse which hosts and integrates a large collection of Arabidopsis genomics data including gene expression, orthologs, pathways, interactions, publications and others. We have continued to add new content and functionalities to ThaleMine. These include GeneRIFs, together with a portal to NCBI’s submission page that will allow community members to submit their own comments on gene function, and phenotype and stock data with links to ordering from ABRC and eNASC. The most recent addition is an RNA-seq-based expression module that allows users to view expression levels of their favorite genes across the 113 RNA-seq data sets used in the Araport11 re-annotation process.
Science Apps, Web Services and Modules
The Araport framework allows registered users to install runnable code. Users may install Science Apps (client-side JavaScript programs that display content) or web services (server-side Python programs that deliver data) or community modules (combinations of Science Apps and web services). At this time, Araport hosts 21 Science Apps (listed on Table 1) and over 100 web services (including services contributed by SUBA and FLOR-ID). The Provart Lab is in the process of installing ePlant, an extensive module that integrates many visualization techniques and a diversity of data types, which will be Araport’s largest externally-developed module.
News
Despite its technical success and demonstrated ability to assimilate and integrate a wide range of data types, the site sees many fewer visitors than expected. Furthermore, although the attendees at the 2011 Design Workshop were enthusiastic about their vision of a federated data model with many community-contributed modules, their enthusiasm has so far not translated into the level of participation envisaged in the white paper. This is of concern to all of us, including our funders - the US National Science Foundation and the UK Biotechnology and Biological Sciences Research Council. As we develop a proposal for continued funding of the project, we will be pro-actively recruiting major data generators to the project to facilitate assimilation of their data into Araport and demonstrate the value of integration of multiple data types within the portal.
Conferences and Workshops
Project PIs attended the 25th ICAR in Paris in July 2015. In addition to a talk in a Plenary Session, there was a well-attended Araport workshop with contributions both from project PIs and from community contributors. We attended the ASPB meeting in Minneapolis in July 2015, presented a talk in “Bioinformatics Resources for Plant Biology Research” and also staffed a booth in the Exhibitor area together with colleagues from other resources. Project staff presented posters and/or talks at the Mid-Atlantic ASPB meeting (April 2015), the University of Maryland Mini-symposium (May 2015), and the Mid-Atlantic Plant Molecular Biology Society Meeting (August 2015). Two team members spent one and a half days at Purdue University in November 2015 giving talks, a hands-on workshop and having one-on-one discussions with various faculty members. We organized the IAIC/Araport workshop at PAG, San Diego in January 2016 that included presentations from project personnel and community members.
Araport gave a talk at the ASPB mid-Atlantic Regional Meeting at Swarthmore in April 2016 and has been invited to give talks at ICAR 2016 in Korea in July and at the “GARNet2016: Innovation in the Plant Sciences” meeting in Wales in September 2016. We also plan a presence at the ASPB meeting in Austin in July 2016.
The Arabidopsis Information Resource (TAIR)
Tanya Berardini
Leonore Reiser
Erica Bakker
Phoenix Bioinformatics, 39221 Paseo Padre Parkway, Ste J. Fremont, CA 94538
August 6th 2020
At TAIR, our view of the state of global Arabidopsis research is based on what we see through the lens of curating published literature. We import currently published ‘Arabidopsis’ papers as they are indexed by PubMed and use them to curate experimental gene function data. Each week, we load between 50 and 90 papers with the term ‘Arabidopsis’ in the title or abstract. We then review the abstracts and put the papers that seem to have functional information about Arabidopsis genes (around 41%) into our curation queue, prioritizing those with information about newly characterized genes.
As biocurators who seek to extract and organize data in meaningful ways, we share the following perspectives:
Overall, we see: 1) a steady number of papers that report on functions for previously characterized genes and 2) an increase in the number of papers that describe high throughput experiments and contain large datasets. As the number of papers and the amount of data increase, we at TAIR are developing strategies to increase throughput by, among other things, incorporating more computation in the processing. But there are things that authors can do to aid curation.
Occasionally, our import process misses relevant, curatable articles, typically because the papers either fail to mention Arabidopsis as a species or because the unique locus identifiers (e.g. AT1G01010) are not included in accessible (text) format anywhere in the paper. The high throughput papers present another challenge to curation, as frequently the gene lists are attached as supplementary tables that lack metadata or are in formats that are not easily parsed, such as PDF, which limits their accessibility and reuse. Issues such as the absence of (accessible) identifiers in manuscripts and the proliferation of unstructured datasets highlight the need for researchers to become familiar with FAIR (Findable, Accessible, Interoperable and Reusable; https://www.force11.org/group/fairgroup/fairprinciples) data principles to ensure that their published data is compliant, as well as the need for new and better tools to make it easier for researchers to make their data FAIR.
As a start, TAIR has generated a ‘cheat sheet’ (https://conf.arabidopsis.org/pages/viewpage.action?pageId=22807345) to help researchers learn how they can make their published data more FAIR.
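To illustrate why locus identifiers in plain text matter, the following is a minimal sketch of how such identifiers could be pulled out of a manuscript or supplementary text file. It is not TAIR’s import pipeline; the function name `extract_agi_ids` and the regular expression are assumptions made for illustration, based only on the identifier format shown above (e.g. AT1G01010), with chromosomes 1-5 plus the chloroplast (C) and mitochondrial (M) genomes.

```python
import re

# Assumed pattern for Arabidopsis locus identifiers such as AT1G01010:
# "AT", a chromosome code (1-5, C or M), "G", then exactly five digits.
AGI_PATTERN = re.compile(r"\bAT[1-5CM]G\d{5}\b", re.IGNORECASE)


def extract_agi_ids(text):
    """Return the unique locus identifiers found in text, upper-cased,
    in order of first appearance."""
    found = []
    for match in AGI_PATTERN.finditer(text):
        locus = match.group(0).upper()
        if locus not in found:
            found.append(locus)
    return found


if __name__ == "__main__":
    sample = "We characterized At1g01010 and AT5G67300 (see also AT1G01010)."
    print(extract_agi_ids(sample))
```

A search like this only works when the identifiers appear as text; identifiers that exist only inside an image or a scanned PDF table are invisible to it, which is the accessibility problem described above.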
Another observation is that many more papers that include Arabidopsis genes do so as a reference for a different primary organism being studied. These are papers we import but do not curate. In these papers, Arabidopsis genes are often used to predict functions based on homology or used to find knockout mutants for heterologous gene transformation experiments. These papers highlight the important role that Arabidopsis continues to play as a model organism as researchers branch out (pun intended) into other plant species. As more experimental data is generated in other species, there is also a corresponding need to codify what is learned about gene functions from other plant species (especially what differs from Arabidopsis or reflects biological systems unique to those species) in order to have a comprehensive understanding of plant gene functions. To address this need, we are in the process of developing a tool to enable researchers to curate functions for any gene from any organism.
Recent activities and newly developed tools and resources.
In the past year TAIR has made some significant operational and technical improvements, enhancements to the website and tools, and added data and resources to aid in understanding plant gene function.
Operational and Technical Changes
In May 2019 TAIR officially transitioned away from providing ordering capabilities for the ABRC. Researchers should now order stocks directly from the ABRC (https://abrc.osu.edu/). We continue to incorporate links from data pages (e.g. locus, gene, alleles, clones and germplasms) to the relevant stock centers (i.e. ABRC, NASC and RIKEN) to make finding resources easier for our community. Another significant change has been the addition of JBrowse to TAIR due to the loss of funding for the Araport (www.araport.org) project. TAIR, BAR and NCGR have joined together to ensure that the data and tools formerly provided by Araport remain available to the community. With help from members of the Araport and GMOD projects, TAIR has installed the latest version of JBrowse at TAIR (https://bit.ly/2Qhb5xC) starting with the tracks that were available at Araport. In the process, we have repaired tracks that were previously broken (e.g. Brassica Vista tracks) and also added new community data tracks for several published experiments (1,2). For those interested in making their sequence-based data public via JBrowse, TAIR welcomes your data submission. Please contact us.
In addition to the above changes, we performed software updates and technical improvements. We updated TAIR’s BLAST service (https://www.arabidopsis.org/Blast/index.jsp) to the latest version of NCBI BLAST (2.9.0), and included all of the custom TAIR BLAST datasets. WU-BLAST was retired and a graphical display of alignments was added to the TAIR BLAST results display. Finally, we significantly speeded up page loading for the heavily accessed TAIR locus pages by making substantial changes to the underlying software.
PhyloGenes
In April 2019, Phoenix Bioinformatics, in collaboration with the Thomas Lab/PANTHER project at USC (www.pantherdb.org), launched PhyloGenes (www.phylogenes.org), a new web resource that facilitates inference of gene function based on phylogenetic relationships. PhyloGenes displays precomputed gene trees from PANTHER DB, alongside experimental gene function data or multiple sequence alignments. It makes use of the extensively curated information about gene function from Arabidopsis and 10 non-plant model species.
The most recent release (March 11, 2020; https://conf.arabidopsis.org/display/PHGSUP/Release+Notes) contains 40 plant species. By presenting a cohesive view of gene function in a phylogenetic context, PhyloGenes simplifies the process of assigning gene function to unknown genes. For species not included in the PhyloGenes build, users can graft protein sequences onto existing trees. TAIR now includes links out to PhyloGenes on the TAIR locus detail pages, in the Gene Families section, to view the corresponding families.
Arabidopsis MicroPublications
In October 2019, TAIR launched a partnership with the open access, peer-reviewed online journal microPublication (https://www.micropublication.org/). microPublication publishes brief, novel findings, negative and/or reproduced results, and results which may lack a broader scientific narrative. Micropublications typically contain a single figure. TAIR curators review submitted works to ensure that, where possible, relevant data can be captured in TAIR. Types of data we curate from micropublications include gene functions, mutant phenotypes and expression data. Each paper is assigned a DOI and is citable. microPublications fulfill a need for a mechanism to share data that would not otherwise be published, such as student work from course-based undergraduate research experiences. If you have questions about microPublications or are interested in serving as a reviewer, please contact us.
Gene function curationTAIR curators continue to extract experimental gene function data from the current literature and codify the data in the form of annotations to Gene Ontology and Plant Ontology terms as well as curated gene summaries, alleles and phenotypes, and gene symbols. Along with curating recent literature, we have begun making a concentrated effort to identify and fill in gaps about missing gene function where possible. In 2019 we began by identifying sets of genes for which there were no GO annotations at all, reviewing our linked literature, and adding annotations where possible. The ‘unknown’ list is publicly available (https://conf.arabidopsis.org/pages/viewpage.action?pageId=22807120) and we encourage the community to contribute data if they have functional information for any of these genes. We continue to produce quarterly updates of current data for subscribers (https://www.arabidopsis.org/download/index-auto.jsp?dir=%2Fdownload_files%2FSubscriber_Data_Releases), and year old data for use by all (https://www.arabidopsis.org/download/index-auto.jsp?dir=/download_files/Public_Data_Releases). As always, we are grateful to our subscribers and data submitters for ensuring that this important resource continues to be available and up to date.
Planned future activities
Aside from continuing our normal curation activities, a major goal for the coming year is to update the TAIR software stack. We plan to significantly overhaul the backend systems and replace most of the older codebase with more modern technology. This ground-up redesign will enable greater flexibility, scalability, and more responsive, configurable web pages.
Conferences, Workshops and Training events
1- PAG 2020: Phoenix Bioinformatics staff organized several workshops for PAG2020 (Arabidopsis Informatics, Database Sustainability) and will continue to do so for 2021. We presented an update on TAIR during the Arabidopsis Informatics session.
2- ASPB 2019: Phoenix/TAIR co-organized the Plant Bioinformatics workshop at ASPB2019 in San Jose and presented a talk about PhyloGenes. We also presented a talk on the state of functional gene annotation at TAIR with an emphasis on what remains to be known.
3- ICAR 2019: TAIR organized the Bioinformatics resources workshop at ICAR2019 in Wuhan, China, and presented an update on TAIR and PhyloGenes.
Slides from TAIR presentations are available on the Phoenix Bioinformatics SlideShare (https://www.slideshare.net/PhoenixBio). We also maintain a social media presence on Twitter (@tair_news) and Facebook (https://www.facebook.com/tairnews/).
Lee TA, Bailey-Serres J. Integrative Analysis from the Epigenome to Translatome Uncovers Patterns of Dominant Nuclear Regulation during Transient Stress. Plant Cell. 2019;31(11):2573–2595. doi:10.1105/tpc.19.00463
Thieffry, A., Bornholdt, J., Ivanov, M., Brodersen, P., Sandelin, A. Characterization of Arabidopsis thaliana promoter bidirectionality and antisense RNAs by depletion of nuclear RNA decay enzymes. bioRxiv 809194; doi: https://doi.org/10.1101/809194
Plant Projects and Resources with Strong Participation of the Arabidopsis Community
-
Bio-Analytic Resource for Plant Biology (BAR)
By Nicholas Provart (Director), http://bar.utoronto.ca.
Open Tools and Resources for Arabidopsis Researchers
The Bio-Analytic Resource is a collection of user-friendly web-based tools for working with functional genomics and other data for hypothesis generation and confirmation. Most are designed with the plant (mainly Arabidopsis) researcher in mind. Data sets include:
* 150 million gene expression measurements (75 million from A.th.), plus “expressologs” (homologs showing similar patterns of expression in equivalent tissues) for many genes across 12 species. View expression patterns with our popular eFP Browser or newer ePlant tool.
* 70,944 predicted protein-protein interactions plus 62,626 experimentally determined PPIs (rice interologs also available!) and ~2.8 million protein-DNA interactions, which can be explored with our new Arabidopsis Interactions Viewer 2 tool.
* 29,180 predicted protein tertiary structures and experimentally determined structures for 402 Arabidopsis proteins.
* Millions of non-synonymous SNPs from the 1001 Arabidopsis Genomes project, now delivered via the 1001 Genomes API.
* Documented subcellular localizations for 11.7k proteins, predicted localization for most of the Arabidopsis proteome, from the SUBA database at the University of Western Australia.
Recent activities
The news from December 2018 that Araport would not continue to be funded by the National Science Foundation in the U.S. precipitated a meeting in March, 2019 at which it was agreed that the BAR and TAIR would resuscitate the existing Thalemine and JBrowse instances, respectively, and that Araport’s previous functionality would be expanded by adding a Genome Context Viewer from the National Center for Genome Resources to enable the viewing of multiple fully assembled Arabidopsis thaliana genomes. The Bio-Analytic Resource rolled out a revived and updated version of Araport’s Thalemine at https://bar.utoronto.ca/thalemine/ in December 2019.
The BAR also published its eFP-Seq Browser at https://bar.utoronto.ca/eFP-Seq_Browser/ for exploring RNA-seq data as both read map profiles and summarized gene expression levels across two large compendia, in order to be able to quickly identify samples with the highest level of expression or where alternative splicing might be occurring (Sullivan et al., 2019).
The BAR was also happy to announce its first single cell RNA-seq eFP Browser view at http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi?dataSource=Single_Cell, based on root single cell RNA-seq data from John Schiefelbein’s lab at the University of Michigan (Ryu et al., 2019). We also added a DNA damage RNA-seq data set by Bourbousse et al. (2018) to the eFP Browser at http://bar.utoronto.ca/efp/cgi-bin/efpWeb.cgi?dataSource=DNA_Damage, which is one of seven RNA-seq-based eFP Browser views, coming soon to ePlant.
We enabled links for 2.8 M Ecker Lab DAP-seq PDIs in our AIV2 tool (http://bar.utoronto.ca/interactions2/) to corresponding peak data in the Ecker Lab’s AnnoJ Browser. The Gazzarrini and Lumba Labs at the University of Toronto (Carianopol et al., 2020) identified 125 SnRK1 complex interacting proteins using a meso-scale Y2H screening approach against ABA-regulated gene products and we’ve added these, along with hundreds of other PPIs published in the past year, into the AIV2 tool and database.
For non-Arabidopsis plant researchers, May the 4th 2019 was with you! ePlants for 15 agronomically important species became available on the BAR homepage at http://bar.utoronto.ca. We will be growing these in the future by adding more data, and we welcome comments and ideas for new data sets for them.
For maize researchers, new BAR eFP images and links to an updated gene atlas were enabled in MaizeGDB based on data from the Buell Lab (Hoopes et al., 2019).
We collaborated to create a new eFP view for the microdissection work of Thomas Widiez and colleagues, which generated RNA-seq data from an early timepoint in maize seed development (Doll et al., 2020): http://bar.utoronto.ca/efp_maize/cgi-bin/efpWeb.cgi?dataSource=Maize_Kernel.
The Mutwil Lab in Singapore published a Selaginella moellendorffii expression atlas (Ferrari et al., 2020). We collaborated to create an eFP Browser for it: http://bar.utoronto.ca/efp_selaginella/cgi-bin/efpWeb.cgi.
Jin Zhang, Xiaohan Yang and colleagues published a nice paper on how light quality and intensity modulate the transcriptome in Kalanchoe (Zhang et al., 2020). We collaborated to create our first CAM plant eFP Browser, see http://bar.utoronto.ca/efp_kalanchoe/cgi-bin/efpWeb.cgi.
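The eFP Browser links quoted in this report share a visible pattern: a per-species instance path (efp, efp_maize, efp_selaginella, efp_kalanchoe) followed by cgi-bin/efpWeb.cgi and an optional dataSource parameter. The following is a minimal Python sketch for assembling such links. The pattern is inferred only from the URLs above, not from any documented BAR API, and the helper name efp_url is hypothetical.

```python
from urllib.parse import urlencode

# Base address of the BAR, as given in this report.
BAR_BASE = "http://bar.utoronto.ca"


def efp_url(instance="efp", data_source=None):
    """Assemble an eFP Browser link (hypothetical helper, not a BAR API).

    instance    -- path segment of the eFP instance, e.g. "efp" or "efp_maize"
    data_source -- optional view name passed as the dataSource query parameter
    """
    url = f"{BAR_BASE}/{instance}/cgi-bin/efpWeb.cgi"
    if data_source:
        # urlencode escapes any characters that are unsafe in a query string.
        url += "?" + urlencode({"dataSource": data_source})
    return url


if __name__ == "__main__":
    print(efp_url(data_source="Single_Cell"))
    print(efp_url("efp_maize", "Maize_Kernel"))
    print(efp_url("efp_selaginella"))
```

Only the view names that appear in this report (Single_Cell, DNA_Damage, Maize_Kernel) are known to correspond to existing views; any other value passed to the helper is an untested assumption.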
BAR Publications (plus 2 citations* mentioned above, not in collaboration with the BAR)
Bourbousse, C., Vegesna, N., and Law, J.A. (2018). SOG1 activator and MYB3R repressors regulate a complex DNA damage network in Arabidopsis. Proc. Natl. Acad. Sci. U. S. A. 115: E12453–E12462*.
Carianopol, C.S., Chan, A.L., Dong, S., Provart, N.J., Lumba, S., and Gazzarrini, S. (2020). An abscisic acid-responsive protein interaction network for sucrose non-fermenting related kinase1 in abiotic stress response. Commun. Biol. 3: 145.
Doll, N.M., Just, J., Brunaud, V., Caïus, J., Grimault, A., Depège-Fargeix, N., Esteban, E., Pasha, A., Provart, N.J., Ingram, G.C., Rogowsky, P.M., and Widiez, T. (2020). Transcriptomics at Maize Embryo/Endosperm Interfaces Identifies a Transcriptionally Distinct Endosperm Subdomain Adjacent to the Embryo Scutellum. Plant Cell 32: 833.
Ferrari, C., Shivhare, D., Hansen, B.O., Pasha, A., Esteban, E., Provart, N.J., Kragler, F., Fernie, A., Tohge, T., and Mutwil, M. (2020). Expression Atlas of Selaginella moellendorffii Provides Insights into the Evolution of Vasculature, Secondary Metabolism, and Roots. Plant Cell 32: 853.
Hoopes, G.M., Hamilton, J.P., Wood, J.C., Esteban, E., Pasha, A., Vaillancourt, B., Provart, N.J., and Buell, C.R. (2019). An updated gene atlas for maize reveals organ-specific and stress-induced genes. Plant J. 97: 1154–1167.
Ryu, K.H., Huang, L., Kang, H.M., and Schiefelbein, J. (2019). Single-Cell RNA Sequencing Resolves Molecular Relationships Among Individual Plant Cells. Plant Physiol. 179: 1444*.
Sullivan, A. et al. (2019). An ‘eFP-Seq Browser’ for visualizing and exploring RNA sequencing data. Plant J. 100: 641–654.
Zhang, J. et al. (2020). Light-responsive expression atlas reveals the effects of light quality and intensity in Kalanchoë fedtschenkoi, a plant with crassulacean acid metabolism. GigaScience 9.
Planned future activities
A custom eFP view in ePlant for a researcher’s own RNA-seq data is in the works, along with “Gaia” (kind of like Siri or Alexa, but for Arabidopsis information) as part of a new award from Genome Canada. Several new ePlants are also planned as part of this project, and an ecosystem viewer will also be developed.
Conferences, Workshops and Training events
The BAR participated in the 2019 American Society of Plant Biologists (ASPB) Plant Biology conference in San Jose, as part of the Plant AgData Outreach booth and in the Plant Bioinformatics workshop, and in Plant and Animal Genomes (PAG) XXVIII at the start of 2020 in San Diego, California.
The BAR principal investigator Nicholas Provart released a “Plant Bioinformatic Methods” specialization in plant bioinformatics, encompassing 4 courses on Coursera.org, see https://www.coursera.org/specializations/plant-bioinformatic-methods. These may be audited for free, or you can get a certificate for a small fee.
-
BrassiBase
By Marcus A. Koch (director), http://brassibase.cos.uni-heidelberg.de/.
BrassiBase is being continuously developed into a comprehensive Brassicaceae knowledge database system. During 2015/2016 a first family-wide species checklist was created. In total, more than 15,000 taxonomic entities (“names” of species, subspecies, etc., including synonyms) have been collected, checked and cross-referenced. We are now in the process of using this most current and accurate species checklist as the “backbone” for BrassiBase and linking existing information to it whenever possible.
Furthermore, morphological descriptions of the characters of every genus are now finalized and implemented in an interactive key to the genera. We hope that this will help to identify cultivated and/or collected material more easily, particularly if used in combination with the “Phylogenetic placement tool” implemented in BrassiBase.
We intend to release the third version of BrassiBase during 2016 and we invite and encourage the Arabidopsis community to register with BrassiBase (it’s free) and help improve the system by reporting and contributing results and data and/or spotting problems and making suggestions for future releases.
Conferences and Workshops
BrassiBase workshop held in Heidelberg in October 2015.
-
CyVerse
By Parker Antin (principal investigator), Eric Lyons (co-principal investigator), Nirav Merchant (co-principal investigator), Matthew Vaughn (co-principal investigator) and Doreen Ware (co-principal investigator), http://www.cyverse.org/.
CyVerse is one of eight projects funded by the National Science Foundation (NSF) Directorate for Biological Sciences. CyVerse is a dynamic virtual organization led by the University of Arizona to fulfill a broad mission to design, deploy, and expand a national cyberinfrastructure for life sciences research and train scientists in its use. CyVerse partner institutions each contribute an important component to the endeavor: Texas Advanced Computing Center, Cold Spring Harbor Laboratory, and the University of North Carolina at Wilmington.
Developing the Science of the Future
CyVerse fills a niche created by the computing epoch and a rapidly evolving world. Developing solutions to today’s grand scientific challenges means that we must understand how the organisms that contribute to our food, fuels, and ecosystem are shaped by interactions with their environment. CyVerse provides life scientists with powerful computational infrastructure to handle huge datasets and complex analyses, thus enabling data-driven discovery. CyVerse provides access to a comprehensive and cohesive suite of computational resources supporting data management, cloud computing, high-performance computing, high-throughput computing, identity management, and collaboration tools, all built from open source components. CyVerse resources are accessible using multiple methods, including web-accessible applications, command-line-based access and well-described Application Programming Interfaces (APIs) for ease of automation and performing scalable data analysis. The powerful extensible platforms provide data storage, bioinformatics tools, image analyses, cloud services, and more. Answering the need of an era of data science, CyVerse makes broadly applicable computational resources available across the life sciences.
Engaging the Data Science Community
CyVerse was launched in 2008 as the iPlant Collaborative, aiming to serve the plant science research community. From its inception, iPlant quickly grew into a mature organization providing powerful resources and offering scientific and technical support services to researchers nationally and internationally. Now rebranded to CyVerse, the project has expanded the mandate to provide CI support across the life sciences. CyVerse CI architecture and implementation is agnostic with regards to scientific domain and supports many different life science disciplines and their associated data types and analyses. CyVerse allows researchers to analyze their growing datasets more efficiently, with greater flexibility, and to address previously difficult or impossible questions. Together, CyVerse CI permits researchers to deposit and share new data, programmers to easily deploy new tools and analytical workflows, and researchers of all skill levels to easily use and reuse those data and tools. CyVerse has created a robust, widely used, and evolving CI that is profoundly impacting life sciences and bioinformatics. CyVerse also provides training, learning material, and best practice resources to help all researchers make the best use of their data, expand their computational skill set, and effectively manage their data and computation when working as distributed teams.
Creating Global Collaborations
CyVerse envisions a future where all biologists have access to, are able to use, and know how to extend CI to solve problems and advance scientific discovery in research and apply CI to education. Through partnerships and direct engagement, CyVerse has helped accelerate the pace of science for many labs and individual researchers by offering computational and data management solutions that meet the demands of modern scientific technologies. Going forward, CyVerse aims to promote computational thinking and empower researchers to new scientific discoveries by enabling global collaborations in data sharing, management, analysis, and visualization.
-
International Plant Phenotyping Networks
International Plant Phenotyping Network
http://www.plant-phenotyping.org/
Philipp von Gillhaussen,
Operations Manager at IPPN
Julich Plant Phenotyping Centre
The International Plant Phenotyping Network is a non-profit association (since 2015) representing the major plant phenotyping centers & practitioners within industry & academia, internationally.
IPPN aims to provide all relevant information about plant phenotyping. The goal is to increase the visibility and impact of plant phenotyping and enable cooperation by fostering communication between stakeholders in academia, industry, government, and the general public. Through workshops and symposia, IPPN established different working groups which advance specific fields of plant phenotyping research & applications and distribute all relevant information about plant phenotyping in a web-based platform (www.plant-phenotyping.org).
The purpose of the IPPN is to promote science, research and further applications in the field of plant phenotyping, focusing particularly on these goals:
- establishing a global network of institutions in order to maximize existing synergies, identify and reduce potential bottlenecks via joint projects and within topic-specific working groups;
- fostering communication and cooperation between different stakeholders from academia, industry, and the general public;
- increasing the visibility and impact of plant phenotyping beyond its own research community;
- facilitating the interdisciplinary training needed for effective plant phenotyping research and education.
Additionally, IPPN is eager to identify and initiate research projects, particularly to establish plant phenotyping in new regions and to implement advanced training activities.
Recent activities and newly developed tools and resources
At the end of 2019 and beginning of 2020, the IPPN general assembly elected a new executive board, which will lead the organization for at least the next four years. For the first time in the history of the organization, members from both academia and industry are represented on this board.
Among the new board members are internationally renowned institutions such as Forschungszentrum Juelich (Germany), Wageningen University & Research (Netherlands), Purdue University (USA) and the University of Nebraska-Lincoln (USA). On the industry side, sensor developers and system integrators PhenoVation (Netherlands) and Phenotrait (China) are board members.
Planned future activities
After the last International Plant Phenotyping Symposium (IPPS) in China (autumn 2019) and according to the organization’s internal roadmap, the next IPPS will be held in Europe in 2021. Internal voting on the exact venue is ongoing until April 2020. Preparations will start immediately afterwards.
Conferences, Workshops and Training events
Several sessions, workshops, hackathons and satellite meetings organized by IPPN’s working groups within several international plant science conferences (IPAP, Rooting2020, SEB) have recently been postponed due to the outbreak of the coronavirus pandemic.
To stay informed on these, please visit IPPN’s event section:
https://www.plant-phenotyping.org/index.php?index=580
EMPHASIS
https://emphasis.plant-phenotyping.eu/
Roland Pieruschka
Julich Plant Phenotyping Centre
The European Infrastructure for Multi-scale Plant Phenomics and Simulation (EMPHASIS) is a distributed Research Infrastructure to develop and provide access to facilities and services addressing multi-scale plant phenotyping in different agro-climatic scenarios. EMPHASIS will establish an integrated European phenotyping infrastructure to analyse genotype performance under diverse environmental conditions and quantify the diversity of traits contributing to performance in diverse environmental conditions − plant architecture, major physiological functions and output, yield components and quality. EMPHASIS aims to address the technological and organizational limits of European phenotyping, for a full exploitation of genetic and genomic resources available for crop improvement in a changing climate. Included in the ESFRI Roadmap in 2016, EMPHASIS is in the transition from the Preparatory Phase (2017-2020) to the Implementation Phase (2020-2021) and is expected to become operational in 2022.
Recent activities and newly developed tools and resources
EMPHASIS is in the process of developing its business plan in close discussion with national ministries with the final goal of enabling long-term operation. Currently, ministry representatives from 14 countries have expressed their interest in supporting the implementation of EMPHASIS, and a discussion was initiated to develop an Interim General Assembly as the decision-making body to develop and implement EMPHASIS as an organisation providing services to the plant science community in Europe.
Additionally, based on the landscaping analysis, which included a close interaction with the Arabidopsis community, EMPHASIS started to pilot specific highly demanded services to test and illustrate their potential to generate benefits and return on investment, as well as to test the feasibility of these services. The pilot services include topics related to: i) access to field sites, ii) harmonisation, iii) data management, iv) harnessing innovation, v) modelling.
EMPHASIS is also active beyond plant sciences, utilizing synergies with other research infrastructures in projects such as i) CORBEL linking biological and medical infrastructures, ii) ENVRIPlus bringing together environmental and earth system research infrastructures, iii) EOSC-Life developing digital life sciences, iv) RI-VIS aiming at increasing the global visibility of European research infrastructures.
Conferences, Workshops and Training events
EMPHASIS will organize annual European Plant Phenotyping conferences starting in 2021. The first conference will be co-organized with the International Plant Phenotyping Network.
European Plant Phenotyping Network 2020
https://eppn2020.plant-phenotyping.eu/
Roland Pieruschka
Julich Plant Phenotyping Centre
EPPN2020 is an H2020-funded research infrastructure project (Grant Agreement: 731013) that provides European public and private scientific sectors with access to a wide range of state-of-the-art plant phenotyping facilities, techniques and methods, with the aim of supporting the exploitation of genetic and genomic resources available for crop improvement, which represents a major scientific challenge for coming decades. EPPN2020 specifically aims to facilitate community progress across the whole phenotyping pipeline, involving sensors and imaging techniques, data analysis in relation to environmental conditions, data organization and storage, data interpretation in a biological context and meta-analyses of experiments carried out on different organs at different scales of plant organization.
Recent activities and newly developed tools and resources
EPPN2020 has recently finalized a fifth call for transnational access enabling over 120 experiments within innovative plant phenotyping facilities in Europe. A substantial number of the 31 facilities providing access within EPPN2020 focus on Arabidopsis research; the experiments range from deep phenotyping for high-precision screening of specific traits to high-throughput screening of large populations.
Planned future activities.
EPPN2020 will announce the sixth and final call for applications for access to EPPN2020 facilities in April 2020. Selected users can carry out the proposed experiments for free, including travel and accommodation. In total, we expect to be able to facilitate over 150 experiments.
Conferences, Workshops and Training events
EPPN2020 will host a dedicated session on benchmarking EPPN2020 transnational access experiments during SEB2021 in Antwerp.
-
Gramene: A comparative resource for plants
By Marcela Karey Tello-Ruiz (Project Manager) and Doreen Ware (PI), http://www.gramene.org/. Download 2020-2021 Report
August 6th 2020
Recent activities and newly developed tools and resources
Gramene provides open access to plant genomes, gene functional annotations including curated and projected metabolic and regulatory pathways, and gene expression data in a phylogenetic context, based on robust phylogenetic gene trees. In collaboration with EnsemblGenomes, Gramene hosts 67 plant reference genomes (about 2.5 million genes in total) including three Arabidopsis species (almost 100,000 genes): A. thaliana, A. lyrata, and A. halleri. For each reference genome sequence, we provide structural and functional gene annotations including ontology associations and protein domain assignment, genetic and structural variants, phylogenetic trees with orthologous and paralogous gene classification, whole-genome alignments, and synteny maps.
Our phylogenetic trees include 96,607 gene families comprising over 2.1 million genes or almost 2.4 million input proteins supporting homolog and ortholog assignments to Arabidopsis species. Functional and structural information is provided for each family tree in visually informative (e.g., color-coded protein domains and tick marks indicating splice junctions) and interactive (e.g., ability to select a specific GO term or InterPro domain) views to highlight homologs that share functional features.
A. thaliana serves as an anchor species within Gramene. A. thaliana homologs are displayed as part of the query results within the Gramene search and A. thaliana is used as the dicot model for the pairwise whole-genome alignment collection. Within the past year, the alignments subset for A. thaliana grew from 57 to 66, including alignments between A. thaliana and each of A. lyrata and A. halleri. In addition, we host alignments between A. lyrata and each of Medicago truncatula, Oryza sativa (Japonica rice), Theobroma cacao (cacao), and Vitis vinifera (grapevine); and A. halleri to Japonica rice, cacao, and grapevine. Our synteny collection includes synteny maps for A. thaliana and each of the following five species: A. lyrata, Brassica rapa, Japonica rice, cacao, and grapevine; and for A. lyrata and grapevine. We continue to host 12.9 million Arabidopsis SNPs from the 1001 Arabidopsis Genomes Project. Variants are provided in the context of gene annotation, gene regulation, and protein domain structure, along with predicted functional consequences (e.g. missense variant), and genotypes.
In our continued collaboration with the Expression Atlas project (EMBL-EBI), we provide baseline expression data for 24 species, including A. thaliana and A. lyrata, through both our Gramene Ensembl genome browser and Plant Reactome pathways interfaces. In addition, we provide direct links to differential gene expression data on the EMBL-EBI Expression Atlas website for a partially overlapping set of 24 species, including A. thaliana and A. lyrata. More recently, the Expression Atlas developed the capacity to host single-cell gene expression data; currently five data sets from four studies are available (Ryu et al, 2019; Jean-Baptiste et al, 2019; Shulse et al, 2019; Turco et al, 2019).
In collaboration with Reactome, Gramene hosts 306 metabolic and regulatory pathways curated in Japonica rice and inferred in 96 additional plant species (including the three Arabidopsis species) based on orthology. Reactome pathways are checked and peer-reviewed prior to publication to ensure factual accuracy and compliance with the data model, and a system of evidence tracking ensures that all assertions (which use community standard controlled vocabulary ontologies) are supported by primary literature. Gramene’s integrated search capabilities and interactive views facilitate visualizing gene features, gene neighborhoods, phylogenetic trees, gene expression profiles, and pathways. The homology view in the search interface allows custom pruning of the gene trees to selected species of interest, and visualizing sequence conservation down to the amino acid level. The views also assist cross-referencing to other bioinformatics resources, including AraPort, TAIR and NASC for Arabidopsis.
Gramene provides tools to support integration of user data sets, in context to the reference data. These tools include a sequence assembly converter (which allows the conversion of genomic coordinates between the TAIR9 and TAIR10 genome assemblies), a genetic variant effect predictor, an advanced BioMart-based query interface, data analysis and visualization of OMICS data, multi-species pathway comparisons, and BLAST/BLAT sequence aligner. Together these reference comparative genome data and tools enable powerful cross-species comparisons among plants and reference eukaryotic species.
Gramene data sets that include Arabidopsis species:
• Structural and functional annotations for 2.2 million gene models in 67 plant reference genomes including three Arabidopsis model species (A. thaliana, A. lyrata, and A. halleri), cereal, vegetable, and fruit crops (e.g., Brassicas, Fabaceae, Solanaceae), basal plants and algae.
• 96,607 phylogenetic tree families (built with 67 plant and 5 non-plant species), 299 whole-genome alignments (66 with Arabidopsis species), and 66 synteny maps (6 with Arabidopsis sp.).
• About 224 million genetic and structural variants for 11 plant species, including 12.9 million Arabidopsis SNPs from the 1001 Arabidopsis Genomes Project. The Arabidopsis SNP set includes genotypes for over 1,000 accessions, and was combined with phenotypic data (107 phenotypes associated with 95 inbred lines) from the GWAS study by Atwell et al (2010).
• Experimental baseline and differential expression data for 827 experiments in 24 plant species, including A. thaliana and A. lyrata.
• 306 reference metabolic and regulatory pathways curated in rice and inferred in 96 additional plant species (including the three Arabidopsis species in Gramene).
• Integrated search capabilities and interactive views to query and visualize gene features, gene neighborhoods, phylogenetic trees, gene expression profiles, pathways, and cross-references to other bioinformatics resources (e.g., AraPort, TAIR, and NASC).
• Analysis tools to support comparative analyses of our data as well as user-provided data (e.g., BLAST/BLAT sequence aligner, sequence assembly converter for TAIR9/TAIR10 genomic coordinates, genetic variant effect predictor, BioMart, Reactome pathways analysis/visualization of OMICS data and multi-species pathway comparisons).
Gramene is committed to open access and reproducible science based on the FAIR (Findable, Accessible, Interoperable and Reusable) data principles. We are a phylogenomic resource, built upon best-of-class open source software: the Ensembl, Reactome, and Expression Atlas infrastructure platforms.
Gramene has developed a powerful and flexible document-based architecture that enables advanced searching via a web-service accessible by a variety of programming languages; each platform supporting web-based and programmatic access through application programming interfaces (APIs).
Extensive use of ontologies, database cross-references, common data formats, metadata, community engagement and open-source software promotes interoperability within the ecosystem of informatics data and services. Gramene’s genome portal utilizes the Ensembl infrastructure and is developed in collaboration with the Ensembl Genomes project (EMBL-EBI); the pathway portal, Plant Reactome (http://plantreactome.gramene.org), utilizes the Reactome infrastructure and is developed in collaboration with OICR; the baseline expression data in both our genome and pathway browsers is provided in collaboration with the Expression Atlas project (EMBL-EBI). Integration across these platforms in Gramene is supported by NSF grant IOS-1127112 and partially by USDA-ARS (8062-21000-041-00D).
Jean-Baptiste K, McFaline-Figueroa JL, Alexandre CM, Dorrity MW, Saunders L et al. (2019) Dynamics of Gene Expression in Single Root Cells of Arabidopsis thaliana. Plant Cell. 31(5):993-1011. doi: 10.1105/tpc.18.00785.
Ryu KH, Huang L, Kang HM, Schiefelbein J (2019) Single-Cell RNA Sequencing Resolves Molecular Relationships Among Individual Plant Cells. Plant Physiol. 179(4):1444-1456. doi: 10.1104/pp.18.01482.
Shulse CN, Cole BJ, Ciobanu D, Lin J, Yoshinaga Y et al. (2019) High-Throughput Single-Cell Transcriptome Profiling of Plant Cell Types. Cell Rep. 27(7):2241-2247.e4. doi: 10.1016/j.celrep.2019.04.054.
Turco GM, Rodriguez-Medina J, Siebert S, Han D, Valderrama-Gómez MÁ et al. (2019) Molecular Mechanisms Driving Switch Behavior in Xylem Cell Differentiation. Cell Rep. 28(2):342-351.e4. doi: 10.1016/j.celrep.2019.06.041.
Planned future activities.
With future support, we will continue to maintain and build the Gramene resource, aiming for a minimum of two releases. Our goals are to: 1) update and expand our reference data collection of plant genomes, genetic variation, gene expression, and standardized comparative annotations; 2) enrich our Plant Reactome pathways data resource with newly curated pathways and orthology-based projections; 3) improve the functionality of the Gramene search interface and integrate DIVE (Gupta et al., 2018) gene functional information; and 4) transform the community through communication and training opportunities.
Gupta A, Xu W, Jaiswal P, Taylor C, Regala J (2018) Domain Informational Vocabulary Extraction Experiences with Publication Pipeline Integration and Ontology Curation. http://ceur-ws.org/Vol-2285/ICBO_2018_paper_43.pdf
Conferences, Workshops and Training events
In the past year, Gramene participated in 14 scientific conferences: 61st annual Maize Genetics Conference; 2019 Biocuration; 2019 CSHL Systems Biology and Engineering; 2019 CSHL Biology of Genomes; 2019 ASPB Plant Biology; 2019 International Conference on Arabidopsis Research (ICAR); 2019 CSHL Genome Informatics; 2019 Wellcome Trust Plant Genomes and a Changing Environment; 3rd International Conference on Plant Synthetic Biology, Bioengineering, and Biotechnology; 2019 CSHL Plant Genomes and Biotechnology; 2019 & 2020 EU COST Integrape; 2020 Genomes Working Group CA17111; and XXVIII Plant and Animal Genomes (PAG).
We organized community outreach booths for members of the AgBioData Consortium at the PAG and ASPB conferences. We continued to host educational webinars on Gramene’s YouTube channel (https://goo.gl/ln9RLD). We organized one pathway curation jamboree and one virtual gene structural annotation jamboree for faculty (predominantly from primarily undergraduate institutions, or PUIs), graduate students, and maize researchers.
For the latter, we hosted four 3-hour live webinars and made the video recordings, as well as our training material, available in a Google Drive folder (shorturl.at/oBHVZ). Other plant education activities geared to K-12 students and faculty included hands-on activities to celebrate Fascination of Plants Day at Bayville Elementary School, Science Night and a DNA workshop for Science Olympiad participants at Franklin K-8 School, Saturday DNA at the CSHL DNA Learning Center, and panels on Ethics in Genetic Engineering at SUNY Stony Brook and on Food and Climate at CSHL.
In the next year, we plan to continue our outreach, education and training activities, including attending the PAG, ASPB and CSHL Biology of Genomes meetings.
Besides the projects and resources listed above, there are many other international and multinational initiatives with major contributions from Arabidopsis researchers, e.g. the 1001 Genomes Project (http://www.1001genomes.org/), the Epigenomics of Plants International Consortium (EPIC; http://www.plant-epigenome.org/), the Plant and Microbial Metabolomics Resource (http://metnetdb.org/PMR/) and the International Plant Phenotyping Network (http://www.plant-phenotyping.org/).