Data Abyss 101 - Search Options

Data Abyss offers a range of search options, from Web to Awards. This can be intimidating for users who are just starting out and trying to get familiar with Data Abyss and everything it has to offer. The purpose of this document is to describe each of these search options.

The Data Abyss search interface and The Rivalry Platform

I call the larger infrastructure behind the Data Abyss search capabilities The Rivalry Platform. The Rivalry Platform can be thought of as everything that functions behind the scenes to make Data Abyss operate: crawlers and collectors, VPNs, proxies, processing capabilities, and natural language processing.

Web - Web is a full-text search capability built from thousands of Chinese journals and other S&T literature, which are processed to make them full-text searchable. The default search runs across the full text, and users can use the Advanced Search function to run fuzzy or exact searches on the contents of these documents. Users looking for very specific articles can also add AND and OR statements in the Advanced Search.
The results can then be selected and analyzed or read. Since I extract the text and render it as HTML on the page, users can use Google Translate to read the articles if need be.
We recommend using Google Chrome with Data Abyss, since it has Google Translate built in.

Images - Images returns exactly the same results as Web but offers another way of visualizing them: as images. These are the images embedded in the Web documents. An image can be selected and analyzed to view additional images from the articles. The arrow at the top right of the selected image takes the user to that specific article in Web.

Publications - Publications is a collection of journal article metadata based on China National Knowledge Infrastructure (CNKI), such as titles, abstracts, authors, affiliations, keywords, funding, and more.


Organizations - Organizations are derived from the affiliations in the Publications data. This search covers millions of international organizations derived from the Chinese journals. Users can search with shorthand such as State Key to get a list of China's State Key Laboratories for further analysis.

People - People is a collection of global S&T researchers, scientists, and more. It is not limited to China; it is a global author search. The Advanced Search allows users to search by an individual's institution, the institution's continent, and more.


Foreign Talent - Foreign Talent is derived from corresponding author data in Publications. These are top professors and scientists listed on Chinese articles as the point of contact for the research. Corresponding authors leave email addresses on the journal articles, and we use this data point as the foundation for this search option.
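To make the idea concrete, here is a minimal sketch of how corresponding authors with an email address could be pulled out of publication records. It is an illustration only, not the Data Abyss implementation: the record layout and the field names (title, authors, name, email, corresponding) are all hypothetical.

```python
def corresponding_contacts(publications):
    """Group publication titles by corresponding-author email.

    The record layout used here is an assumption made for illustration.
    """
    contacts = {}
    for pub in publications:
        for author in pub.get("authors", []):
            email = (author.get("email") or "").strip().lower()
            # Keep only authors flagged as corresponding who left an email.
            if not (author.get("corresponding") and email):
                continue
            entry = contacts.setdefault(email, {"name": author["name"], "titles": []})
            entry["titles"].append(pub["title"])
    return contacts


# Hypothetical records; real Publications metadata will look different.
sample = [
    {"title": "Paper A", "authors": [
        {"name": "J. Doe", "email": "J.Doe@example.org", "corresponding": True},
        {"name": "L. Wang", "email": None, "corresponding": False},
    ]},
    {"title": "Paper B", "authors": [
        {"name": "J. Doe", "email": "j.doe@example.org", "corresponding": True},
    ]},
]
contacts = corresponding_contacts(sample)
print(contacts)
```

Keying on the lowercased email is one simple way to treat the same contact on several articles as a single entry.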

Technologies - Technologies are keywords derived from Publications. These keywords can be in English or Chinese, so it's best to search both variations. The results of a Technologies search are related keywords from the journal articles. A keyword can be selected to analyze the technology further and review additional data points on it.

Rare Technologies - Rare Technologies is a lot like Technologies but uses a much different search capability. The search here uses the same data as the Technologies search but uses a “multi-bucket value source based aggregation which finds "rare" terms — terms that are at the long-tail of the distribution and are not frequent.” In other words, it uses a statistical measure to identify the LEAST common technologies.
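The idea of a "rare terms" search can be sketched in a few lines of plain Python. This is a simplified illustration of the concept, not the Data Abyss implementation: the function name, the max_doc_count threshold, and the sample keywords are assumptions made for the example.

```python
from collections import Counter


def rare_terms(documents, max_doc_count=1):
    """Return keywords that appear in at most `max_doc_count` documents."""
    doc_counts = Counter()
    for keywords in documents:
        # Count each keyword once per document, so this is a document count.
        doc_counts.update(set(keywords))
    return sorted(term for term, n in doc_counts.items() if n <= max_doc_count)


# Hypothetical keyword lists, one per journal article.
articles = [
    ["deep learning", "radar", "graphene"],
    ["deep learning", "radar"],
    ["deep learning", "metamaterial absorber"],
]
rare = rare_terms(articles)
print(rare)  # ['graphene', 'metamaterial absorber']
```

Raising max_doc_count widens the long tail: with a value of 2, "radar" would also be returned, while "deep learning" stays out because it appears in every article.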

Awards - Awards are derived from the funding metadata in Publications. Awards are the accounting or project numbers of the projects that conduct and fund the research. These are usually a series of letters and numbers, such as 2018M643869. Here is an example page for this award number: https://app.dataabyss.ai/award/2018M643869
Users can search for anything from a keyword to an organization to identify Awards that might be of particular interest to analyze or investigate further.
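For users who keep lists of award numbers, a small helper can turn each number into a link. This is a sketch that assumes every award page follows the same URL pattern as the single example above; that pattern is inferred from one link and is not a documented API, and the function name is hypothetical.

```python
from urllib.parse import quote

# Assumed pattern, inferred from the one example award link.
AWARD_BASE_URL = "https://app.dataabyss.ai/award/"


def award_url(award_number):
    """Build an award page link from an award number."""
    cleaned = award_number.strip()
    if not cleaned:
        raise ValueError("award number must not be empty")
    # Percent-encode the number so unusual characters stay URL-safe.
    return AWARD_BASE_URL + quote(cleaned, safe="")


url = award_url(" 2018M643869 ")
print(url)  # https://app.dataabyss.ai/award/2018M643869
```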
