Conversations on the Web are becoming distributed and fragmented. Content is increasingly syndicated and re-aggregated beyond its original context. Technologies such as RSS, Atom, and PubSubHubbub deliver a real-time flow of updates to readers, but the conversations around that content fragment as a result: comments, ratings, and annotations increasingly happen at the aggregator and are invisible to the original source.
The Salmon Protocol is an open, simple, standards-based solution that lets aggregators and sources unify the conversations. It focuses initially on public conversations around public content.
A detailed specification for Salmon is available, along with a separate specification for the signature mechanism. Please refer to those specifications for the most up-to-date information.
A source provides an RSS/Atom feed of content. It includes a Salmon link in its feed:
An aggregator reads the feed (ideally via a push mechanism such as PubSubHubbub), and sees from the link that it is Salmon-enabled. It remembers the endpoint URL for later use.
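As a minimal sketch of the discovery step, an aggregator could scan the feed-level links for a Salmon endpoint as shown here. The `rel` value `"salmon"`, the sample feed, the endpoint URL, and the function name are assumptions made for illustration; the Salmon specification defines the actual link relation to look for.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed. The rel value "salmon" and the URLs are assumptions,
# not values taken from the specification.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example source</title>
  <link rel="self" href="http://example.org/feed"/>
  <link rel="salmon" href="http://example.org/salmon-endpoint"/>
</feed>"""


def find_salmon_endpoint(feed_xml, rel="salmon"):
    """Return the href of the first feed-level link with the given rel, or None."""
    root = ET.fromstring(feed_xml)
    for link in root.findall(ATOM_NS + "link"):
        if link.get("rel") == rel:
            return link.get("href")
    return None


# The aggregator would store this URL alongside the feed for later use.
endpoint = find_salmon_endpoint(SAMPLE_FEED)
print(endpoint)
```

A feed without such a link yields `None`, which the aggregator can treat as "not Salmon-enabled".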
When an aggregator's user leaves a comment on a feed item, the aggregator stores the comment as usual, and then also POSTs a salmon version of it to the source's Salmon endpoint:
The source responds to the salmon with standard HTTP status codes: 2xx for OK, 4xx for an input problem, 5xx for a source or server error. The usual result is for the salmon to be published along with other comments on the source's web page. Note that sources are not obligated to publish the salmon; they may moderate them, filter them as spam, or aggregate or analyze them instead. However, if the source does publish the salmon in a comment feed, it has to maintain certain fields to make the protocol work end-to-end.
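The POST and the status handling described above might look roughly like the following on the aggregator side. This is a sketch under stated assumptions: the comment body, the endpoint URL, the `application/atom+xml` content type, and both function names are hypothetical, and signing is left out entirely. The request is only constructed here, not sent.

```python
import urllib.request

# Hypothetical unsigned comment entry; a real salmon also carries a signature.
SAMPLE_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <author><name>Example Commenter</name></author>
  <content>Nice post!</content>
</entry>"""


def build_salmon_request(endpoint, atom_entry_xml):
    """Build (but do not send) an HTTP POST carrying one salmon."""
    return urllib.request.Request(
        endpoint,
        data=atom_entry_xml.encode("utf-8"),
        # Assumed content type; check the specification for the required one.
        headers={"Content-Type": "application/atom+xml"},
        method="POST",
    )


def classify_response(status):
    """Map an HTTP status code onto the outcomes named in the text."""
    if 200 <= status < 300:
        return "accepted"
    if 400 <= status < 500:
        return "input problem"
    if 500 <= status < 600:
        return "source error"
    return "unexpected"


request = build_salmon_request("http://example.org/salmon-endpoint", SAMPLE_ENTRY)
print(request.get_method(), request.full_url)
print(classify_response(202))
```

Separating "input problem" from "source error" lets the aggregator decide whether a retry makes sense: a 5xx may be worth retrying later, while a 4xx usually is not.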
These requirements kick in if a source republishes salmon alongside native comments, and are intended as traffic signals to ensure smooth operation of the protocol for everybody.
The end result of the protocol is that sources and aggregators can co-operate to present a unified view of the global conversation around any topic represented by an RSS/Atom feed item.
Users should be made aware of the publishing scope of the comments they leave. For some aggregators, this may be implied (all data is public), for others a warning or a checkbox may be necessary. We suggest enabling Salmon only when the original content is itself publicly visible. For simplicity, Salmon does not attempt to deal with private data or distributed access control, though these can be addressed in future extensions.
A major concern with this type of distributed protocol is how to prevent spam and abuse. Salmon provides building blocks for defense in depth against attacks. Specifically, every salmon has a verifiable author and user agent. The basic security flow when salmon swims upstream looks like this:
The flow can get more complicated, especially if the aggregator is not also providing identity services for the user.
As a convenience, anyone can run a salmon validator service that does step 3 as a public service. Anyone who is willing to trust the salmon validator service can use it. So in the simplest possible case, depending on a validator service and not using OAuth to verify the sending service, the flow can be:
This requires the recipient only to understand the data format and have an HTTPS library. The service is simply a convenience, not a central mechanism; the actual validation is always done via the public key signature contained in the salmon element, using the mechanism described in the Magic Signatures specification. Thus recipients do not need to depend on the validator service.
The Salmon validation step is intended as a first line of defense that lets other reputation and rate-limiting mechanisms kick in. That is, it allows recipients to assign a fixed quota to authors and IdPs, and to block those who exceed reasonable limits. It allows them to build up reputations for the parties involved, to whitelist and blacklist, and to federate as needed. It also allows third parties to double-check the results: if a source simply makes up salmon comments, they will not validate; if it tampers with comments, they can be correlated via IDs and the tampering detected and exposed.
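To illustrate how a fixed quota could sit behind the validation step, here is a small sketch. The class name, the quota value, and the use of an author URI as the key are all hypothetical, and the identity is assumed to have already passed signature validation. A real deployment would likely add time windows and persistent storage.

```python
from collections import defaultdict


class QuotaTracker:
    """Fixed per-identity quota with an explicit blocklist (illustrative only)."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = defaultdict(int)
        self.blocked = set()

    def block(self, identity):
        """Refuse all further salmon from this identity."""
        self.blocked.add(identity)

    def allow(self, identity):
        """Return True and count the salmon if the identity is within quota."""
        if identity in self.blocked:
            return False
        if self.counts[identity] >= self.limit:
            return False
        self.counts[identity] += 1
        return True


# The key is assumed to be a verified author (or IdP) identifier.
tracker = QuotaTracker(limit=2)
author = "acct:commenter@example.org"
print([tracker.allow(author) for _ in range(3)])
```

The same tracker can be instantiated once per kind of identity, so that authors and IdPs are limited independently.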
For flexibility and interoperability, salmon may be modified in reasonable ways before republishing. For example, truncating salmon to fit within a service's length limit, translating character set encodings, and even translating into another language would all be reasonable. But in all of these cases the me:provenance element will contain the original data as well.
Signatures are generated per the Magic Signatures specification.
Salmon is intended to be Activity Stream-compatible. Salmon endpoints should be able to accept Activity Stream activities as well as generic Atom formatted comments -- in fact a generic Atom formatted comment is also a valid activity in the current AS spec. Liking, rating, and linking to content all contribute to conversations. Note that the endpoint may not understand or accept all types of activities, and we would like to have a way for an endpoint to declare up front what kinds of activities it is prepared to accept (if for no other reason, to avoid bothering end users with checkboxes that won't do anything).
We believe there should be an alternative JSON format for salmon, and hope that we can simply adopt the Activity Streams JSON format.
Salmon supports RSS streams, and could also allow POSTing of an RSS formatted salmon. The exact details need to be worked out, but a natural representation would be:
(See references in the specification documents.)
John Panzer (email@example.com, http://www.abstractioneer.org)