At a time of growing public distrust of, and resistance to, artificial intelligence, the scientific community has not been left untainted by recent scandals and must respond to society's justified concerns. OLKi wants to take part in that nascent effort by ensuring that datasets, and particularly the linguistic datasets that are the essential resource of machine learning techniques, can be easily hosted and interacted with by the whole scientific community, with the twofold objective of making research data transparent and understandable.
The OLKi project will connect researchers, scientists and citizens through the Fediverse. To do so, it will extend the ActivityStreams language to support scientific research datasets and their related activities on the web, and create the OLKi platform to showcase this extension.
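To make the idea of extending ActivityStreams concrete, the following is a minimal sketch of how the publication of a dataset might be expressed as an ActivityStreams 2.0 activity. It is only an illustration under stated assumptions: the `Dataset` type and the `inLanguage` property are not part of the core ActivityStreams vocabulary and are borrowed here from schema.org through a hypothetical context mapping, and the function name, identifiers and URLs are invented for the example. It does not describe the vocabulary that OLKi will actually define.

```python
import json

# Standard ActivityStreams 2.0 context and public addressing collection.
AS2_CONTEXT = "https://www.w3.org/ns/activitystreams"
AS2_PUBLIC = "https://www.w3.org/ns/activitystreams#Public"

# Hypothetical extension context: maps two schema.org terms into the document.
# This mapping is an assumption made for illustration, not an OLKi specification.
DATASET_EXTENSION = {
    "schema": "http://schema.org/",
    "Dataset": "schema:Dataset",
    "inLanguage": "schema:inLanguage",
}


def make_dataset_announcement(actor_id, dataset_id, name, summary, language):
    """Build a Create activity wrapping a (hypothetical) Dataset object."""
    dataset = {
        "id": dataset_id,
        "type": "Dataset",        # extension type, not in core ActivityStreams
        "name": name,
        "summary": summary,
        "attributedTo": actor_id,
        "inLanguage": language,   # extension property, useful for linguistic data
    }
    return {
        "@context": [AS2_CONTEXT, DATASET_EXTENSION],
        "id": dataset_id + "/activity",
        "type": "Create",
        "actor": actor_id,
        "to": [AS2_PUBLIC],
        "object": dataset,
    }


if __name__ == "__main__":
    activity = make_dataset_announcement(
        actor_id="https://olki.example/users/alice",
        dataset_id="https://olki.example/datasets/42",
        name="Example speech corpus",
        summary="A small annotated corpus used as a placeholder.",
        language="fr",
    )
    print(json.dumps(activity, indent=2))
```

Because the activity remains a regular `Create` addressed to the public collection, a server that does not know the extension type could still receive and relay it, even if it cannot display the dataset in a meaningful way.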
The Fediverse was originally a social-network-oriented text communication platform that emerged as a citizen initiative in 2017. It has grown quickly, both in number of users (more than 2.5 million in November 2018) and in supported functionalities, which now include blog-like platforms (Plume), image sharing (Pixelfed), video sharing (PeerTube) and audio sharing (Funkwhale), all while maintaining a common basis for compatibility and interoperability (Pleroma being one of the main examples). One of the main strengths of the Fediverse is that all of these features are interconnected through open standards, so that users of one service can access the others as well. A major side effect is that web applications do not have to compete with each other but can understand each other (with some degree of compatibility). This might seem insignificant, since it amounts to the classic gains of using open standards. However, in the case of social interactions, it allows even small and novel software to sidestep the penalty implied by Metcalfe's law, under which a network's value grows with its number of users, by gaining support from existing communities.
The OLKi project will further enhance the Fediverse with a new functionality oriented towards scientific communication and the dissemination of research results. This will fill the current need for a dedicated, global and open communication platform among researchers, but also between scientists and citizens, a need that is currently only partially covered by sites such as Reddit, which we argue are unfit for the role. In that sense, it will be more than a platform for hosting datasets, and will provide features that current dataset hosting platforms lack.