Open communities play a critical role in computer science and software engineering, driving large-scale developer adoption, rapid innovation, and the advancement of platforms and technologies. Each subfield in these disciplines has large and vibrant open community initiatives. One such open community is Open AI, in the field of Artificial Intelligence (AI). As a professional society striving to advance service innovation by connecting students and academics with industry partners, ISSIP is keenly interested in Open AI initiatives. Today, ISSIP is proud to launch two initiatives in the Open AI area: (i) an open AI dataset cataloging initiative and (ii) an open AI Services evaluation framework. These two projects are described briefly below.
Open AI datasets cataloging project: Finding suitable data to train and experiment with AI services is a huge task for students and companies alike, and it is one of the biggest pain points in developing AI applications. Most AI projects need good datasets for building and experimenting with AI services, and ISSIP noted this as a gap in the community’s ability to experiment. Today, one has to run many internet searches to find the right datasets to train and test the available AI services. Dataset aggregators such as Kaggle exist, but many specialized datasets offered by others are not well known or cataloged even by those sites. To address this gap, we started an open AI datasets cataloging project. ISSIP’s open AI datasets catalog holds pointers to hundreds of AI datasets, searchable by keyword. We also made the dataset submission process open, so that industry, student, and academic communities can contribute pointers to their own datasets. This searchable open AI dataset catalog is now open and available on the ISSIP website (http://www.issip.org/open-data-sets/?). Please feel free to use the links in this catalog to access the datasets you need for your AI projects, send us your feedback, and help us keep the catalog fresh and enrich it by submitting additional dataset links through the submission link. A screenshot of the searchable open AI datasets catalog is shown below.
Figure 1: A screenshot of the searchable Open AI datasets catalog.
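To make the idea of a keyword-searchable catalog of dataset pointers concrete, here is a minimal sketch in Python. It is not the code behind the ISSIP catalog; the `DatasetEntry` record, the `search` function, and the example entries with placeholder URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """One catalog record: a pointer to a dataset, not the data itself."""
    name: str
    url: str
    keywords: set[str] = field(default_factory=set)


def search(catalog: list[DatasetEntry], query: str) -> list[DatasetEntry]:
    """Return entries that match every query term, ignoring case.

    A term matches when it is a substring of any word in the entry's
    name or of any of its keywords. An empty query returns every entry.
    """
    terms = query.lower().split()
    results = []
    for entry in catalog:
        words = entry.name.lower().split() + [k.lower() for k in entry.keywords]
        if all(any(term in word for word in words) for term in terms):
            results.append(entry)
    return results


# Hypothetical entries with placeholder URLs, for illustration only.
catalog = [
    DatasetEntry("Movie review sentiment", "https://example.org/reviews",
                 {"sentiment", "text"}),
    DatasetEntry("Read speech corpus", "https://example.org/speech",
                 {"speech", "audio"}),
]

matches = search(catalog, "sentiment text")
print([entry.name for entry in matches])
```

Because each record stores only a link and a few keywords, accepting a community submission in a design like this amounts to appending one more entry to the list.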
AI Services testing and evaluation framework: Many companies and universities are building and offering AI building-block services such as speech-to-text, text-to-speech, natural language understanding (identifying entities, concepts, keywords, sentiment, emotions, relations, and semantic roles in a given piece of text), and machine translation (converting text from one language to another). Students, academics, and customers need a good evaluation framework for testing these AI services on open and fair datasets, so that they can compare performance and decide which vendor’s or university’s services to use in their projects. Presently, there is no such open evaluation framework for testing AI services offered by multiple parties. A good testing and evaluation framework should also select a suitable, open, and fair dataset on which to test everyone’s services. Noting these requirements, ISSIP initiated and has spearheaded a project with volunteer engineers to develop and release a testing and evaluation framework for AI Services. The testing framework itself is open source, and everyone in the community can use it as-is or add more test cases, services, and test datasets to it. The vision is for this project to grow into a trusted open evaluation framework and codebase for evaluating AI Services offered by vendors and universities around the world, in various languages. The effort is also bringing together fair and open datasets for testing the services. So far, the project has tested Sentiment and Speech-to-Text AI Services offered by a couple of vendors, and it is being expanded to other services and languages. The screenshots below showcase the capabilities of the AI services evaluation framework.
Figure 2: Welcome Dashboard
Figure 2 above shows the welcome dashboard. The present capability is open for direct use, with no login or registration required. The user can either feed in data directly (i.e., inline, in real time) or upload a test dataset using the “Work with test data” option.
The Speech-to-Text test option lets the user feed in speech directly through a microphone and test the service. Figure 3 below shows Speech-to-Text service evaluation.
Figure 3: Speech to text direct input capability
The user can stop the recording and analyse the results at any time.
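This post does not say which metric the framework reports for Speech-to-Text results. One common choice is word error rate (WER), the word-level edit distance between a reference transcript and the service's output, divided by the number of reference words. The sketch below is a hedged illustration under that assumption; the function name `word_error_rate` and the sample sentences are invented for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative transcripts: one substituted word out of six.
reference = "the cat sat on the mat"
hypothesis = "the cat sat on a mat"
print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

A lower value is better, and the value can exceed 1.0 when the service outputs many extra words, so it is best read as an error rate and not as a percentage of words that were correct.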
The sentiment analysis service evaluation provides the same direct-input and test-data options. Figure 4 below shows the capability where the user feeds in data inline for direct sentiment analysis:
Figure 4: Sentiment analysis direct input capability
The user can also upload data in bulk for sentiment analysis. Figure 5 below depicts this feature:
Figure 5: Sentiment analysis “Working with test data” capability
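The core idea of running several services against the same labeled test data can be sketched in a few lines of Python. This is an illustration only, not the framework's actual code: the `evaluate` function, the two toy services, and the four test sentences are assumptions, and a real adapter would call a vendor's API where the toy functions stand in here.

```python
from typing import Callable

# A service is modeled as any callable that maps text to a sentiment label.
SentimentService = Callable[[str], str]


def evaluate(services: dict[str, SentimentService],
             test_data: list[tuple[str, str]]) -> dict[str, float]:
    """Return each service's accuracy on the same labeled test set."""
    if not test_data:
        raise ValueError("test_data must not be empty")
    scores = {}
    for name, classify in services.items():
        correct = sum(1 for text, expected in test_data
                      if classify(text) == expected)
        scores[name] = correct / len(test_data)
    return scores


# Toy stand-ins for vendor services (hypothetical, for illustration only).
def keyword_service(text: str) -> str:
    lowered = text.lower()
    if "great" in lowered or "love" in lowered:
        return "positive"
    if "bad" in lowered or "hate" in lowered:
        return "negative"
    return "neutral"


def always_positive_service(text: str) -> str:
    return "positive"


# Invented test sentences with their expected labels.
test_data = [
    ("I love this product", "positive"),
    ("This was a bad experience", "negative"),
    ("The package arrived on Tuesday", "neutral"),
    ("Great support team", "positive"),
]

scores = evaluate(
    {"keyword": keyword_service, "always_positive": always_positive_service},
    test_data,
)
for name, accuracy in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {accuracy:.2f}")
```

Treating every service as a plain callable keeps the comparison fair, since all services see exactly the same inputs, and adding another service or more test cases only means adding an entry to a dictionary or a list.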
The code for this AI services testing and evaluation framework is open source and available on the ISSIP GitHub. We welcome volunteers and students to contribute to this project by expanding the number of AI services tested and the languages in which they are tested.
The authors of this blog article and contributors to this work include Rama Akkiraju, Gandhi Sivakumar, Lakshmi Shanmugam, Malarvizhi Kandasamy, Rizwan Dudekula, Sarika Sinha, Kesav Viswanadha, Jim Spohrer, and Yassi Moghaddam.