1 of 34

Knowledge Federation in Catalyst

Lessons Learnt with Interoperability

2 of 34

The Catalyst consortium

3 of 34

Goal: Shared model of Collective Intelligence

4 of 34

Goal: Ecosystem of Collective Intelligence tools

5 of 34

Shared model inspiration

Initial Inspirations

  • Concept maps (Compendium, debategraph)
  • Wiki

Naive concept map model: ideas and links, with name and description.

The concept map refers to ideas in a conversation.

6 of 34

Intuitions that informed the data model

  • Argument mapping with progressive formalization
    • Formal thinking is the end result, should not block input
    • Typing can be applied afterwards
    • Versioning
    • Idea merge/split
    • Problematization of links requires easy reification
  • Multiple views/descriptions (choral). Meaning is context-dependent.
    • Versioned views (maps)
    • Multilingual, multiple audiences, levels of language, etc.

None of that was done in Catalyst scope, but it informed the data model.

7 of 34

Notes on the document/idea dichotomy

A written document usually names many ideas, and refers to many more. Document boundaries do not match idea boundaries.

Wiki’s intuition was to align document and idea: the page’s title expresses the idea, and the page’s content is the canonical view of that idea. (But paragraphs can be autonomous: see PurpleNumbers, FedWiki paragraphs. That is still a syntactic breakdown.)

Assembl uses quotes, and reassembles idea structure narratively in synthesis. Ideas live in an outline, outside of document structure. The canonical view includes the set of quotes, and quoted documents.

8 of 34

Multiple idea representations

Is there a canonical view? Or are idea representations Choral (Mike Caulfield)?

Social jargon (Ward Cunningham 2010): Hyperlink can have many destinations (disambiguation)

Federated Wiki: Link logic. Where the (internal) links land is decided by an algorithm, which can be altered. This allows choosing the version, destination site, etc.

9 of 34

Interoperability: Model overview

Ideas: Nodes and links

  • Compendium, GraphDB-like
  • Maps are also ideas
  • Links point to nodes (vs. hypergraph)

10 of 34

Interoperability: Model overview

Let’s correct those relationships

We can talk about Links.

In RDF terms, they are pre-reified in the model.
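
A minimal sketch of what "pre-reified" can mean in practice: a link is itself a node, so another link can point at it. The class and field names (`Node`, `Link`, `source`, `target`) are illustrative assumptions, not the Catalyst vocabulary.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional

_ids = count(1)  # hypothetical local identifier source


@dataclass
class Node:
    name: str
    description: str = ""
    id: int = field(default_factory=lambda: next(_ids))


@dataclass
class Link(Node):
    # A link is also a node, so it can be the source or target of another link.
    source: Optional[Node] = None
    target: Optional[Node] = None


claim = Node("Raise the carbon tax")
objection = Node("Hurts low-income households")
opposes = Link("opposes", source=objection, target=claim)

# Because the link is a node, the link itself can be questioned.
doubt = Node("Is this objection relevant?")
questions = Link("questions", source=doubt, target=opposes)
```

Here `questions.target` is the `opposes` link rather than an idea, which is the "we can talk about links" property.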

11 of 34

Interoperability: Model overview

Contributions

  • A document is not an idea
  • Posts are documents with containers (eg Forum) and reply structure

12 of 34

Interoperability: Model overview

Annotation and idea reference

Documents contain text runs that express an idea (aboutness)

(A document can refer to an idea without annotation)
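
As an illustration of aboutness, an annotation can be thought of as a text run in a document tied to an idea. The sketch below assumes character offsets as the way to select the text run; the `Excerpt` name, its fields, and the identifiers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Excerpt:
    document_id: str  # hypothetical document identifier
    start: int        # character offset where the text run starts
    end: int          # character offset where it ends (exclusive)
    idea_id: str      # the idea this text run expresses


def excerpt_text(document_body: str, excerpt: Excerpt) -> str:
    """Return the quoted text run from the document body."""
    return document_body[excerpt.start:excerpt.end]


body = "We should tax carbon. Also, the meeting is on Tuesday."
excerpt = Excerpt("post-1", 0, 21, "idea-carbon-tax")
quoted = excerpt_text(body, excerpt)
```

The document stays a document; only the selected run is linked to the idea.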

13 of 34

Interoperability: Model overview

Agents have multiple profiles

Voting

  • Flexible model: includes liking, multiple choice...

Change history

  • Actor-target model: Who/when/what.
  • Useful for analytics
  • Reconstituting past versions was not in scope
  • Assembl has snapshots internally
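
A possible shape for the actor-target change history, with one simple analytic on top. The record fields, the verb vocabulary, and the sample data are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Change:
    actor: str      # who
    when: datetime  # when
    verb: str       # what was done (hypothetical vocabulary)
    target: str     # what it was done to


# Made-up sample log.
log = [
    Change("alice", datetime(2015, 3, 1, 9, 0, tzinfo=timezone.utc), "create", "idea-1"),
    Change("bob", datetime(2015, 3, 1, 9, 5, tzinfo=timezone.utc), "update", "idea-1"),
    Change("alice", datetime(2015, 3, 2, 8, 0, tzinfo=timezone.utc), "create", "idea-2"),
]


def activity_per_actor(changes):
    """Count how many recorded changes each actor made."""
    return Counter(change.actor for change in changes)


counts = activity_per_actor(log)
```

A log like this supports analytics, but it does not store enough to reconstitute past versions, which matches the scope noted above.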

14 of 34

Catalyst Interchange Format: Ontologies

Reuse of other ontologies:

  1. FOAF
  2. SIOC
  3. OpenAnnotation

Ontology alignment with AIF, IBIS-PrivateAlpha

JSON-LD as common standard

http://bit.ly/catalyst_interop0
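
To give a flavour of reusing FOAF and SIOC in JSON-LD, here is a hedged sketch of a post replying to another post. It is not the actual Catalyst context or vocabulary (see the specification linked above for that); the `example.org` identifiers are invented, and only well-known FOAF and SIOC terms are used.

```python
import json

post = {
    "@context": {
        "foaf": "http://xmlns.com/foaf/0.1/",
        "sioc": "http://rdfs.org/sioc/ns#",
    },
    "@id": "http://example.org/posts/42",  # invented identifier
    "@type": "sioc:Post",
    "sioc:content": "We should tax carbon.",
    "sioc:reply_of": {"@id": "http://example.org/posts/41"},
    "sioc:has_creator": {
        "@id": "http://example.org/accounts/alice",
        "@type": "sioc:UserAccount",
        # One person (agent) can stand behind several accounts (profiles).
        "sioc:account_of": {
            "@id": "http://example.org/people/alice",
            "@type": "foaf:Person",
            "foaf:name": "Alice",
        },
    },
}

serialized = json.dumps(post, indent=2)
```

The result is ordinary JSON for consumers that ignore `@context`, and RDF for those that process it.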

15 of 34

Ecosystem components

CI Platforms host knowledge graphs and conversations

Services import data (as conversations or annotations)

Visualization widgets

Conversation analytics

16 of 34

Ecosystem components

17 of 34

Ecosystem components: platforms

Assembl (I4P)

Deliberatorium (MIT)

DebateHub (KMI)

LiteMap (KMI)

18 of 34

Use case: Data importation

  • IMAP (I4P)
  • RSS (I4P)
  • Social network (Facebook) importation (I4P)
  • CMS (drupal) importation (W)
  • Web annotation (I4P, OU)

Theory: Service->CIF, stored as such in platform

Practice: Service->Platform relational data, transformed to CIF on demand.

Consequence: Cannot use transformer without hosting platform.

(exception: Drupal plugin->JSON)

19 of 34

Use case: Visualization widgets

Edgesense (WI)

CIDashboard (KMI)

20 of 34

Use case: visualization widgets

Theory: CIF->Visualization, in a widget.

Practice: Data transformation delay was prohibitive (esp. when using inference)

Also: Lack of deep interaction between widget and platform. (Partial success with Web messaging.)

21 of 34

Use case: data exchange chain

22 of 34

Use case: Conversation analytics

Recommendation system based on analysis of the conversation

Quantitative markers for identified conversation breakdowns: groupthink, issue immaturity, controversy…

Also attention mediation: interesting ideas/users based on interest/agreement clusters, etc.

Work of Mark Klein: http://bit.ly/delib_analytics

23 of 34

Use case: Co-construction across platforms

Conversation tagging in Assembl, visual maps in LiteMap.

Required round-trip CIF importation/exportation, not done in time.

History + 3-way merge in data structures.

24 of 34

What worked?

Each tool got used in real life. Real lessons learned.

Ecosystem much less so. (Though visualizations made good marketing material...)

25 of 34

What did not work?

  • Governance issues
    • Diverted technical resources in I4P => lack of leadership
    • Not enough technical resources elsewhere
    • Not enough time spent on tech coordination
    • University culture of research prototypes not focused on technical collaboration
  • Technical issues
    • Resistance to learning new technologies (RDF, ...)
    • Technical issues with RDF stack maturity (Virtuoso is now a swear word.)
    • Partial understanding of the implications of JSON-LD. (eg linguistic strings, multiple values, multi-classes, inference...)
    • Visualizations are useless without interaction, which won't work without deep integration. Not an easy boundary.

26 of 34

Early assumptions verified and new lessons learnt

  • Adoption is hard. Meet communities where they are & minimize friction.
    • But obstacles to interoperability are high. We mostly did ephemeral communities
    • Institutional, social and technical gatekeepers
  • Not every problem is a wicked problem
    • Knowledge mapping is a demanding model. The community has to perceive value
  • Knowledge mapping can allow collective intelligence at scale
    • But to get that scale, we need to offer value at small scale
    • A lot of AI tools need scale before they’re useful
  • Social channels matter (≠ social networks!)
    • Before the community can think together, they have to feel they’re a community
    • Building trust: from Bohmian dialogue to Deep democracy.

27 of 34

What to develop next?

Input and Analytics are still the best boundaries. Round-trip would help.

  • Human Services ecosystem:
    • Community management
    • Concept mapping
  • Technical Services ecosystem:
    • Document/conversation/knowledge importation
    • Concept extraction and annotation
    • Argument coaching?

I would suggest not trying to automate everything. Let people be crowd-vetted into roles, and ideally be rewarded.

28 of 34

Minimal technical infrastructure

  1. Shared models: modernize Catalyst?
    • topics encompass idea structure
    • Extend quotes to more document types
    • Use activity streams for actions? Are actions optional?
  2. Cross-platform idea identity (sameAs)
    • Include provenance/history information?
    • Allow showing the same idea on another platform (display URL ≠ data URI)
    • Visualization happens on each platform

Compare to Federated Wiki patchwork objects (~synthesis?) and computed link. The idea that you can import an object without importing its dependency tree is important.
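
Cross-platform idea identity through sameAs statements can be sketched as an index of equivalence classes over data URIs. This is one possible approach (a small union-find), not something the Catalyst platforms are known to implement; the class name and the URIs are hypothetical.

```python
class SameAsIndex:
    """Groups data URIs declared to denote the same idea."""

    def __init__(self):
        self._parent = {}

    def _find(self, uri):
        # Unseen URIs start as their own representative.
        self._parent.setdefault(uri, uri)
        while self._parent[uri] != uri:
            # Path halving keeps later lookups short.
            self._parent[uri] = self._parent[self._parent[uri]]
            uri = self._parent[uri]
        return uri

    def declare_same(self, a, b):
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same(self, a, b):
        return self._find(a) == self._find(b)


index = SameAsIndex()
index.declare_same("https://a.example/data/idea/1", "https://b.example/data/node/7")
index.declare_same("https://b.example/data/node/7", "https://c.example/data/issue/3")
```

Each platform can then keep rendering its own copy of the idea while recognizing that the other URIs refer to the same thing.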

29 of 34

Beyond the minimum: Round-trip (synchronization)

How to handle remote change? 3-way merge? In other words, if I have a complex object, I want to know which aspects were changed by me and which were taken from another process, so I can handle the merge properly.

Who keeps the older version that serves as the base of the 3-way merge?

Simplest implementation: Keep last-modified date. Part of REST anyway. And cache what you import.

But some platforms may offer memento functionality.
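
A minimal sketch of a field-level 3-way merge, assuming the importing platform cached the version it last imported and uses it as the base. Objects are flat dictionaries here, and a missing key is treated as a deletion; real idea objects would need more care.

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies of an object against their common base.

    Returns the merged object and the list of conflicting field names.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(key), mine.get(key), theirs.get(key)
        if m == t:
            value = m      # both sides agree, or neither changed
        elif m == b:
            value = t      # only the remote side changed
        elif t == b:
            value = m      # only the local side changed
        else:
            conflicts.append(key)
            value = m      # keep the local value pending resolution
        if value is not None:
            merged[key] = value
    return merged, conflicts


base = {"name": "Carbon tax", "description": "Tax CO2", "type": "idea"}
mine = {"name": "Carbon tax", "description": "Tax CO2 emissions", "type": "idea"}
theirs = {"name": "Carbon pricing", "description": "Tax CO2", "type": "idea"}
merged, conflicts = three_way_merge(base, mine, theirs)
```

Without the cached base, the two differing fields above could not be told apart from conflicts, which is why someone has to keep the older version.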

30 of 34

How much shared typology of ideas could we need?

  • Link polarity (ibis pro/con arguments)
    • Contradictions
  • Idea type constraints (prob. overdesign)
  • Instance, subpart, subclass, cause-effect, succession, arguments, …
  • Evaluations (likelihood, frequency, desirability, importance…)
  • Comparisons? (n-ary)

31 of 34

Push from analytics/data gathering

Analytics (could be process or platform) looks at discussion and gives recommendations. (We need to agree on format for recommendations.)

Issue: Sending the whole dataset to the analytics engine to get recommendations is costly.

Also does not allow analytics to send a push signal.

32 of 34

Options for push

  1. Client subscribes to data model changes. (2-way pull)
  2. Send a stream of deltas; the client keeps its own copy of the data and keeps it up to date
    • Requires a LOT of shared knowledge to apply updates.
  3. Client sends remote queries to server. Most efficient, requires most shared knowledge. Probably not realistic.
  4. Data dependency graph. (even more shared knowledge, but can be applied partially, and degrades well to case 1.)
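
Option 2 can be sketched as a client applying a stream of deltas to a local replica. The delta format below (`op`, `id`, `fields`) is an assumption for illustration; agreeing on such a format and on object identity is exactly the shared knowledge that option requires.

```python
def apply_delta(replica, delta):
    """Apply one delta to a local replica keyed by object id."""
    op, object_id = delta["op"], delta["id"]
    if op == "upsert":
        # Create the object if needed, then overwrite the changed fields.
        replica.setdefault(object_id, {}).update(delta["fields"])
    elif op == "delete":
        replica.pop(object_id, None)
    else:
        raise ValueError(f"unknown operation: {op!r}")


replica = {}
stream = [
    {"op": "upsert", "id": "idea-1", "fields": {"name": "Carbon tax"}},
    {"op": "upsert", "id": "idea-2", "fields": {"name": "Subsidies"}},
    {"op": "upsert", "id": "idea-1", "fields": {"description": "Tax CO2"}},
    {"op": "delete", "id": "idea-2"},
]
for delta in stream:
    apply_delta(replica, delta)
```

An analytics process holding such a replica would not need the whole dataset resent for each recommendation request.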

33 of 34

Layers of abstraction

  • Object references with modification dates (allowing 3-way merge)
  • Privacy, action permissions
  • Data flow and dependency graph?
  • Ideas, documents and references, views
  • Activity streams
  • Filters and operations?
  • Widgets?
  • Idea structure
  • Intent model? Shared action modeling?
  • ?
  • Business models?

34 of 34

Coevolution?

Is it enough to mix-and-match tools and advisers?

We’re still exploring: beware of premature formalization

BUT instrument actions… and intents? Certainly record advice-response configurations. (response to ‘bot :)