Considering the implications of COUNTER Release 5 (including Distributed Usage Logging) & the COUNTER standard for Research Data Usage Metrics
Most people agree that Open Access is a good thing to have, but much of the debate revolves around business models and the cost we should pay for it. Fundamentally, the question is: how much is the output worth?
While it isn't perfect, librarians use cost per use figures in negotiations and in discussions on whether to renew a subscription. But how are cost per use figures usually calculated? Most libraries use the usage reports from COUNTER-compliant publishers as the input for the denominator of the cost per use calculation.
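As a minimal sketch (with made-up cost and usage figures, not from any real report), the calculation itself is trivial - the hard part is getting a trustworthy denominator:

```python
# Illustrative cost-per-use calculation with made-up figures.
# "counter_usage" would come from a COUNTER report (e.g. total full-text
# requests for the year); it is the denominator of the calculation.

def cost_per_use(annual_cost: float, counter_usage: int) -> float:
    """Annual subscription cost divided by COUNTER-reported usage."""
    if counter_usage == 0:
        raise ValueError("no recorded usage - cost per use is undefined")
    return annual_cost / counter_usage

# e.g. a $12,000/year journal package with 3,000 reported downloads
print(round(cost_per_use(12000, 3000), 2))  # -> 4.0
```

Everything that follows in this post is, in one way or another, about what goes into that denominator.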

For those who are not aware, COUNTER stands for Counting Online Usage of Networked Electronic Resources and "is a standard that enables publishers and vendors to report usage of their electronic resources in a consistent way. This enables libraries to compare data received from different publishers and vendors."

Sample COUNTER Journal Report 1 - release 4
If you have ever looked at any journal cost per use figures, it's highly likely that the usage comes from this report.
As such, any change to the COUNTER standard is something worth watching closely.
The COUNTER standard is changing from Release 4 to 5
This year the COUNTER Code of Practice Release 5 took effect, superseding the older Release 4. Of course, not all publishers have updated to Release 5 yet, and for those of us who are responsible for compiling such reports, it can be a bit tricky handling usage reports that use two different versions.
In particular, if you are using the Ex Libris Alma platform, the lack of SUSHI support (SUSHI is a standard for harvesting COUNTER reports) for Release 5 until Q2 2019 can be even more damaging. Publishers like JSTOR have gone ahead to support Release 5 while dropping support for Release 4, resulting in a lack of usage reports until Alma is upgraded.
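For context, the Release 5 version of SUSHI is a RESTful API that returns JSON, which is what platforms like Alma harvest behind the scenes. A rough sketch of what a harvest involves - the base URL and customer_id below are made-up placeholders (real values come from each publisher's SUSHI registry entry), and the sample response is trimmed down to the essentials of the Release 5 report structure:

```python
import json
from urllib.parse import urlencode

# Sketch of harvesting a Release 5 TR_J1 report via the COUNTER_SUSHI API.
# BASE and customer_id are made-up placeholders for illustration only.
BASE = "https://example-publisher.com/sushi"
params = {
    "customer_id": "YOUR_CUSTOMER_ID",
    "begin_date": "2019-01",
    "end_date": "2019-12",
}
url = f"{BASE}/reports/tr_j1?{urlencode(params)}"
# A real harvester would now do: response = requests.get(url).json()

# Trimmed-down sample of the JSON a TR_J1 endpoint returns:
sample = json.loads("""
{
  "Report_Header": {"Report_ID": "TR_J1", "Release": "5"},
  "Report_Items": [
    {"Title": "Journal of Examples",
     "Performance": [
       {"Period": {"Begin_Date": "2019-01-01", "End_Date": "2019-01-31"},
        "Instance": [
          {"Metric_Type": "Total_Item_Requests", "Count": 42},
          {"Metric_Type": "Unique_Item_Requests", "Count": 30}
        ]}
     ]}
  ]
}
""")

# Sum Total_Item_Requests per title across all reported months.
totals = {}
for item in sample["Report_Items"]:
    for perf in item["Performance"]:
        for inst in perf["Instance"]:
            if inst["Metric_Type"] == "Total_Item_Requests":
                totals[item["Title"]] = totals.get(item["Title"], 0) + inst["Count"]

print(totals)  # -> {'Journal of Examples': 42}
```

The point is that once every publisher exposes this same JSON shape, a library system can harvest and aggregate usage automatically - which is why the Alma gap matters.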
COUNTER standards can be pretty confusing, and Release 5 has upped the ante in some ways in terms of the variety of reports you can get. In this blog post, however, I'm going to focus on two main changes in the COUNTER standard that will probably be of interest to everyone and have implications for how the statistics for cost per use are calculated.
In particular, I'm going to cover Distributed Usage Logging (DUL), which aims to "count every download" and will allow publishers to track usage of their articles even when they are downloaded or used off the main publisher platforms (e.g. on Mendeley, ReadCube, or repositories).

This usage will be reported in COUNTER reports going forward.
Also, in the past year, there has been a lot of activity around research data as organizations, including libraries, grapple with the challenges of collecting and managing research data. Besides the challenges of discovery, which I have written about, organizations such as DataCite, in projects like Making Data Count, have started thinking about applying a uniform standard for measuring usage of research data in data repositories.
This has resulted in a COUNTER Code of Practice for Research Data Usage Metrics Release 1. As you will see later, there are some curious similarities and differences between this and the normal COUNTER standards that we are familiar with.
A quick overview of COUNTER Release 5
This blog post obviously isn't going to run through every nuance of how COUNTER Release 5 has changed; for that, COUNTER has released a series of foundational class tutorials you can find on YouTube.
But if you want one video that covers everything in reasonable depth, I recommend the webinar A Deep Dive into COUNTER Code of Practice Release 5. I will be using the slides from that presentation in the rest of this blog.
Deep Dive into COUNTER Code of Practice Release 5
But let me cover the basics.
Release 5 is divided into a few master reports - namely the Title Master Report, Database Master Report, Platform Master Report and Item Master Report. Each report has a few standard views but can be further customized.
Most of us will be looking at the Title Master Report (see below) to study journal and ebook usage.

Each of these views may have multiple Metric Types (see below), and most are reasonably self-explanatory.

One difference from COUNTER Release 4 is the use of the terminology "investigations" and "requests" and the ability to obtain "unique" counts.

Essentially, if downloads are what you are interested in, requests are what you are looking for.
Webinars on COUNTER 5 tend to go into great detail with examples of what counts as a unique request or investigation and what does not, but essentially it is session based.
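A toy illustration of the session-based idea (this ignores the standard's double-click filtering and its exact session rules, so treat it as a sketch, not the official counting algorithm):

```python
# Toy illustration of Total vs Unique item requests in Release 5.
# Each event is (session_id, item_id); repeat requests for the same item
# within one session count once toward the "unique" metric.
events = [
    ("session-1", "doi:10.1234/a"),
    ("session-1", "doi:10.1234/a"),  # same article, same session
    ("session-1", "doi:10.1234/b"),
    ("session-2", "doi:10.1234/a"),  # same article, new session
]

total_item_requests = len(events)
unique_item_requests = len(set(events))  # dedupe (session, item) pairs

print(total_item_requests, unique_item_requests)  # -> 4 3
```

So four raw downloads become four Total_Item_Requests but only three Unique_Item_Requests, because the repeated download within session-1 is collapsed.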

There's a lot more to COUNTER 5 if you are really into the details (e.g. how searches on platforms are classified as "Regular" vs "Automated"), but for our purposes this is all you need to know at a high level.
Journal Report 1 or JR1 from Release 4 doesn't mean what you think it means
One of the most interesting nuances about journal downloads that passed me by was that the old standard JR1 (Journal Report 1) in COUNTER Release 4 didn't quite mean what I thought it did.
If you look at the new standard views for the Title Report, you will notice that the closest equivalent to Journal Report 1 is Journal Requests (Excluding OA_Gold) - TR_J1.

I was curious why, in Release 5, TR_J1 made a point of stating in its description that it excludes Gold OA. So the logical question was: did the old equivalent, COUNTER JR1 in Release 4, include Gold OA? Surprisingly, it did!

As explained in the diagram above, to reconcile the new TR_J1 report with the old JR1 from Release 4, you would need to subtract the Gold OA usage (JR1 GOA).
In other words, in the past, when you used JR1 directly as the denominator of a cost per use calculation and didn't remove the Gold OA usage, you were in a sense "overcounting" usage (you are paying only for paywalled access and should only count usage of paywalled items) and hence getting a lower cost per use than you would have if you had excluded Gold OA.
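With made-up numbers, the overcounting effect looks like this:

```python
# Made-up figures illustrating the JR1 vs TR_J1 reconciliation.
annual_cost = 10000.0
jr1_total = 2500   # old JR1: all full-text requests, Gold OA included
jr1_goa = 500      # the Gold OA portion (the JR1 GOA report)

tr_j1 = jr1_total - jr1_goa  # Release 5 TR_J1 excludes Gold OA

old_cpu = annual_cost / jr1_total  # understates cost per paywalled use
new_cpu = annual_cost / tr_j1

print(old_cpu, new_cpu)  # -> 4.0 5.0
```

The journal looks cheaper per use under the old JR1 ($4.00) than it really is once the free Gold OA downloads are stripped out ($5.00).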
That said, this probably won't make a difference if the journal title you are tracking is 100% subscription and did not flip to an OA model.
I wonder, though: what about a title with a delayed embargo model, where content older than, say, 12 or 48 months is made free to read? Could this be the use case the new TR_J1 - Journal Requests (Excluding OA_Gold) - is designed to catch? I'm not sure.
In any case, for this reason TR_J1 in the new COUNTER report seems to be a much better default than the old JR1.
A major change - aggregation of usage across various platforms using Distributed Usage Logging (DUL)
But a far bigger change in COUNTER 5, I think, was something that totally passed me by - Distributed Usage Logging (DUL). Roger Schonfeld was among the first to talk about this recently.
But what is DUL?

What platforms are we talking about? Elsevier seems to be very quick to support this, and Mendeley's sharing of PDFs is one of the platforms supporting DUL. Another company that seems supportive is Digital Science, as there are mentions of ReadCube and possibly ResearchGate. The latter seems possible now that Springer Nature syndicates content to ResearchGate.
DUL usage statistics are built into the COUNTER standard itself; while not a standard view, they can be obtained from the Master Report. As you can see below, there is a line for "DUL Mendeley".

Just to be clear, usage on Mendeley is tracked even in cases where a researcher stores a publisher version-of-record article in their private library and uses it there. For example, if the user clicks on it to download it from their private library, or shares the PDF privately with others in their Mendeley group and someone in that group clicks on it, that usage is also tracked.
On top of that, usage via PDFs downloaded off the Mendeley web catalogue is included too.

Implications of DUL

The implications of DUL reporting can be staggering. As you see in the diagram above, DUL reporting is envisioned to allow tracking usage anywhere, including social networking sites (probably referring to Academia.edu and ResearchGate), institutional repositories and the catch-all "reading environment".
One would imagine Elsevier, which owns bepress and the Digital Commons network of repositories, might start to support DUL. What if tools like Unpaywall, Kopernio, the Lean Library browser extension and more also started to provide DUL statistics to publishers?
Some wild thoughts and questions about DUL
Leaving aside the privacy concerns, here are some wild thoughts and musings.
Firstly, how does Mendeley (or ReadCube or ResearchGate or...) know which institution a user is from in order to assign usage? Is it simply based on the affiliation they declare in their profile? The IP range? How accurate is this?
Secondly, we know that this tracking works only for items with DOIs, but does it apply only to version-of-record articles? What about author accepted manuscripts? DUL is based on Crossref infrastructure, and I believe standard Crossref practice is for author accepted manuscripts (unlike preprints) to carry the same DOI as the version of record. Does that mean those are tracked too?
Some scenarios I'm mulling over
Imagine someone from institution X shared an article in his Mendeley group and then someone from institution Y downloaded it. Would the usage be assigned to X or Y? Presumably Y? But it is quite possible that Y does not have a subscription to the journal, in which case the DUL count seems unnecessary.
In fact, the whole idea of applying DUL to free articles not behind paywalls seems very odd to me. It might be nice to track, but in my opinion cost per use figures should exclude it.
Worse yet, applying DUL to private sharing feels fundamentally wrong to me. Say I download a paper from ScienceDirect and put it in my Mendeley library; as I do my research, I may continuously open it to read. It seems odd to count all that additional usage when I'm basically using Mendeley like a local folder.
Of course, including DUL in usage has the potential to change the cost per use figure substantially.
On first thought, it seems crazy to me that libraries would want to include DUL figures, as that would give more bargaining power to publishers by increasing usage and decreasing cost per use. It also seems fundamentally odd to me to include usage from free sources (if those are indeed included). But in Isn't Leakage Good for Libraries?, Roger Schonfeld suggests some libraries might want to include DUL while others might not, depending on their strategic goals!
I can't imagine why someone would want to do that, but as you will see later, in a report that surveyed librarians, a slight majority would in fact include the higher usage!
What do librarians think about DUL?
Of course, we cannot avoid thinking about privacy issues. As I write this, Cody Hanson has done a study on third-party code on publisher platforms. He concluded: "Upon evaluation of these audience tools and others, I conclude that many publisher platforms seek to maximize, rather than minimize, the library user identity information that gets associated with users’ behavior".
Would condoning DUL in repositories and PDF-finding tools like Kopernio worsen things?
According to this webinar and report on DUL, there seems to be quite a bit of demand for DUL from stakeholders, including librarians.

In particular, while 70% of librarians preferred the reports to be separate, a slight majority actually "would use the higher number of consolidated usage across multiple platforms to calculate cost-per-use for the content they license from the publisher"!

https://www.projectcounter.org/distributed-usage-logging-report-stakeholder-demand/
To be fair, there were quite a few free-text comments: some saying the question wasn't clear, some saying they only wanted to count paid-for content, and lastly some saying:
"A use is a use, and the consolidated figure would seem a more accurate count of actual use. For example, if one of my students downloaded an article and then shared it with her three groupmates for a project, I'd want to count that as at least 4 uses. My current reporting structure would only allow me to see 1 download, but the use could clearly be higher than that."
A COUNTER Standard for Research Data
DUL, on the face of it, is about aggregation of usage. As a librarian, I have heard researchers wish that citations and downloads of the various versions of their papers on different repositories could be aggregated so they could show the total impact of their work.
My understanding is that the DUL usage in the COUNTER reports I have shown so far will not help with this, as it is at the journal title level. But what about the Item Master Report?

This looks a lot more promising, doesn't it? It allows reports at the article level. Imagine a world where all the repositories and data repositories supported the Code of Practice Release 5 and usage was aggregated at a central hub.
This is where the new COUNTER Code of Practice for Research Data Usage Metrics release 1 comes in.
But let's zoom out a bit and talk about Datacite and the Making Data Count Project.
Datacite and Making Data Count Project
Making Data Count is a project by the California Digital Library, DataCite and DataONE that intends to fill what they perceive as a gap with regard to existing data metrics efforts around research data.
As noted on their history page
"there are no community-driven standards for data usage stats; b) no open source tools for data centers to collect usage stats according to standards; c) and no central place to store, index and access data usage stats, together with other DLM, in particular data citations. Our current project aims to fulfill these gaps."
By data-level metrics (DLM), they refer to citations, social media mentions and usage downloads.
As there are other projects that handle citations of datasets (Scholix, Crossref/DataCite collaborations, etc.), a lot of the focus now falls on creating a standard for usage downloads.
If you want an up-to-date overview of the initiatives around both citation and usage metrics for datasets, I highly recommend Bringing Citations and Usage Metrics Together to Make Data Count.

The key thing to note is that the idea here is not simply having a code of practice adapted for research data. In an ideal world, all data repositories would implement the code of practice and send the data to a central hub - such as DataCite - which would aggregate the usage, allowing librarians and researchers to pull such usage statistics via the DataCite Event Data API.
Thanks for this question. The goal is that each repository will standardize their usage against that standard, and then send their usage stats to @datacite Event Data. Once we have repositories sending to EventData we can do aggregations.
— Make Data Count (@makedatacount) March 26, 2019
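As a sketch of what pulling such aggregated usage might look like - the query parameters and response fields here are my assumptions based on the JSON:API style DataCite uses, so check the current Event Data documentation before relying on them:

```python
import json
from urllib.parse import urlencode

# Hypothetical query against the DataCite Event Data API for usage events
# on a dataset DOI. The DOI, source-id value, and response field names are
# illustrative assumptions, not verified against the live API.
query = urlencode({"doi": "10.5061/dryad.example", "source-id": "datacite-usage"})
url = f"https://api.datacite.org/events?{query}"
# A real client would now do: response = requests.get(url).json()

# Trimmed-down sample of the kind of JSON:API response such a query
# might return, using the COUNTER-style dataset metric names:
sample = json.loads("""
{"data": [
  {"attributes": {"relation-type-id": "unique-dataset-requests", "total": 120}},
  {"attributes": {"relation-type-id": "total-dataset-requests", "total": 310}}
]}
""")

usage = {e["attributes"]["relation-type-id"]: e["attributes"]["total"]
         for e in sample["data"]}
print(usage["unique-dataset-requests"])  # -> 120
```

The appeal is obvious: one API call per DOI, with usage already standardized and aggregated across every participating repository.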
I don't know which repository software or systems currently support this, but I can see there is some development for Dataverse, Zenodo and Dryad implementations. I certainly hope others like Figshare will be joining them.
You can look through the Code of Practice for Research Data and compare it with the current COUNTER Code of Practice Release 5; it is very similar, but I see two main differences.
Firstly, research data has the unique issue of granularity: handling the different aggregated or split subsets of a dataset is going to be tricky when it comes to citations and downloads.
Secondly, it is stated that there is no use case for tracking usage by institution, since research data is generally freely available. The standard does allow tracking by geographic location (country/region) but not by institution. This is of course a major difference from the usual COUNTER standard, where tracking usage by institution is pretty much THE point of the standard.
Conclusion
I must admit that though I did attend webinars on COUNTER 5 last year, somehow DUL passed me by. Still, it's not too late. Librarians are, as usual, in a tight spot. We claim one of our values is protecting user privacy, but in practice the usage statistics seem so useful...

