Figshare API Resources

The information on this page and this site has moved to Figshare's help site: https://help.figshare.com/article/how-to-use-the-figshare-api

You can use the Figshare API to automate record management, gather statistics, or even build custom web apps for users or administrators.

Note: Interested in migrating records in batch? This is certainly possible through the API. However, if you are a repository manager at an institution, the Batch Management Tool may be an easier option for you.

On this page:

Advice and guidance
Basic examples
Retrieve information for multiple items
Example workflows
- Create an author report
- Create records, upload files, and publish
- Batch format and upload metadata from a source
- Delete all or some of the items in an account (useful when testing in stage or if you have lots of uneeded draft records.
Administrator workflows and reporting
Custom Web Applicaitons

Advice and Guidance

Understanding the API documentation
Formatting JSON inputs and outputs
A variety of other scripts can be found here and this help page links to Google Colab versions of some of those scripts
There are many other GitHub repos for Figshare related workflows or integrations
More on stats API endpoints

Examples

Retrieve data

Any public metadata or files can be retrieved through the API without authentication. In the example below, the full metadata record for an item is retrieved. The ITEM_ID is the number at the end of any item’s URL.

Python:

import json
import requests
#Set the base URL and ITEM_ID
BASE_URL = 'https://api.figshare.com/v2'
ITEM_ID = 123456
#Retrieve public metadata from the endpoint
s=requests.get(BASE_URL + '/articles/' + str(ITEM_ID))
#Load the metadata as JSON
metadata=json.loads(s.text)
#View the metadata
print(json.dumps(metadata, indent=2))

Output:

{
  "files": [
    {
      "id": 3074543,
      "name": "2011-02-15_14-24-56-228.png",
      "size": 470683,
      "is_link_only": false,
      "download_url": "https://ndownloader.figshare.com/files/3074543",
      "supplied_md5": "eb5c9c6b278b533f98aa02feae57c266",
      "computed_md5": "eb5c9c6b278b533f98aa02feae57c266"
    }
  ],
  "custom_fields": [],
  "authors": [
    {
      "id": 155524,
      "full_name": "Yunyan Deng",
      "is_active": false,
      "url_name": "_",
      "orcid_id": ""
    },
    {
      "id": 155526,
      "full_name": "Jianting Yao",
      "is_active": false,
      "url_name": "_",
      "orcid_id": ""
    },
    {
      "id": 155529,
      "full_name": "Xiuliang Wang",
      "is_active": false,
      "url_name": "_",
      "orcid_id": ""
    },
    {
      "id": 110581,
      "full_name": "Hui Guo",
      "is_active": false,
      "url_name": "_",
      "orcid_id": ""
    },
    {
      "id": 155532,
      "full_name": "Delin Duan",
      "is_active": false,
      "url_name": "_",
      "orcid_id": ""
    }
  ],
  "figshare_url": "https://plos.figshare.com/articles/dataset/Transcriptome_Sequencing_and_Comparative_Analysis_of_Saccharina_japonica_Laminariales_Phaeophyceae_under_Blue_Light_Induction/123456",
  "description": "<div><h3>Background</h3><p>Light has significant effect on the growth and development of <em>Saccharina japonica</em>, but there are limited reports on blue light mediated physiological responses and molecular mechanism. In this study, high-throughput paired-end RNA-sequencing (RNA-Seq) technology was applied to transcriptomes of <em>S. japonica</em> exposed to blue light and darkness, respectively. Comparative analysis of gene expression was designed to correlate the effect of blue light and physiological mechanisms on the molecular level.</p> <h3>Principal Findings</h3><p>RNA-seq analysis yielded 70,497 non-redundant unigenes with an average length of 538 bp. 28,358 (40.2%) functional transcripts encoding regions were identified. Annotation through Swissprot, Nr, GO, KEGG, and COG databases showed 25,924 unigenes compared well (E-value <10<sup>\u22125</sup>) with known gene sequences, and 43 unigenes were putative BL photoreceptor. 10,440 unigenes were classified into Gene Ontology, and 8,476 unigenes were involved in 114 known pathways. Based on RPKM values, 11,660 (16.5%) differentially expressed unigenes were detected between blue light and dark exposed treatments, including 7,808 upregulated and 3,852 downregulated unigenes, suggesting <em>S. japonica</em> had undergone extensive transcriptome re-orchestration during BL exposure. The BL-specific responsive genes were indentified to function in processes of circadian rhythm, flavonoid biosynthesis, photoreactivation and photomorphogenesis.</p> <h3>Significance</h3><p>Transcriptome profiling of <em>S. japonica</em> provides clues to potential genes identification and future functional genomics study. The global survey of expression changes under blue light will enhance our understanding of molecular mechanisms underlying blue light induced responses in lower plants as well as facilitate future blue light photoreceptor identification and specific responsive pathways analysis.</p> </div>",
  "funding": null,
  "funding_list": [],
  "version": 1,
  "status": "public",
  "size": 470683,
  "created_date": "2012-06-27T00:57:36Z",
  "modified_date": "2016-01-19T09:09:40Z",
  "is_public": true,
  "is_confidential": false,
  "is_metadata_record": false,
  "confidential_reason": "",
  "metadata_reason": "",
  "license": {
    "value": 1,
    "name": "CC BY 4.0",
    "url": "https://creativecommons.org/licenses/by/4.0/"
  },
  "tags": [
    "transcriptome",
    "sequencing",
    "comparative",
    "induction"
  ],
  "categories": [
    {
      "id": 69,
      "title": "Inorganic Chemistry",
      "parent_id": 38,
      "path": "",
      "source_id": "",
      "taxonomy_id": 10
    },
    {
      "id": 13,
      "title": "Genetics",
      "parent_id": 48,
      "path": "",
      "source_id": "",
      "taxonomy_id": 10
    },
    {
      "id": 61,
      "title": "Developmental Biology",
      "parent_id": 48,
      "path": "",
      "source_id": "",
      "taxonomy_id": 10
    }
  ],
  "references": [],
  "has_linked_file": false,
  "citation": "Deng, Yunyan; Yao, Jianting; Wang, Xiuliang; Guo, Hui; Duan, Delin (2016): Transcriptome Sequencing and Comparative Analysis of <em>Saccharina japonica</em> (Laminariales, Phaeophyceae) under Blue Light Induction. PLOS ONE. Dataset. https://doi.org/10.1371/journal.pone.0039704",
  "is_embargoed": false,
  "embargo_date": null,
  "embargo_type": null,
  "embargo_title": "",
  "embargo_reason": "",
  "embargo_options": [],
  "id": 123456,
  "title": "Transcriptome Sequencing and Comparative Analysis of <em>Saccharina japonica</em> (Laminariales, Phaeophyceae) under Blue Light Induction",
  "doi": "10.1371/journal.pone.0039704",
  "handle": "",
  "url": "https://api.figshare.com/v2/articles/123456",
  "published_date": "2012-06-27T00:57:36Z",
  "thumb": "https://s3-eu-west-1.amazonaws.com/ppreviews-plos-725668748/3074543/thumb.png",
  "defined_type": 3,
  "defined_type_name": "dataset",
  "group_id": 107,
  "url_private_api": "https://api.figshare.com/v2/account/articles/123456",
  "url_public_api": "https://api.figshare.com/v2/articles/123456",
  "url_private_html": "https://figshare.com/account/articles/123456",
  "url_public_html": "https://plos.figshare.com/articles/dataset/Transcriptome_Sequencing_and_Comparative_Analysis_of_Saccharina_japonica_Laminariales_Phaeophyceae_under_Blue_Light_Induction/123456",
  "timeline": {
    "posted": "2012-06-27T00:57:36",
    "firstOnline": "2016-01-19T09:09:40"
  },
  "resource_title": "Transcriptome Sequencing and Comparative Analysis of <em>Saccharina japonica</em> (Laminariales, Phaeophyceae) under Blue Light Induction",
  "resource_doi": "10.1371/journal.pone.0039704"
}

Authenticate

Authentication is required for any endpoint that retrieves or accepts private or institutional information. A token can be created for any user account and provides access in line with the account’s privileges. In the example below, a user retrieves 10 basic metadata records from their personal account. These records may include both public and private (draft) records. Note that the results are limited to 10 by using the page and page_size parameters.

Python:

import json
import requests
#Set the base URL
BASE_URL = 'https://api.figshare.com/v2'
#Set the token for the header
api_call_headers = {'Authorization': 'token ENTER-TOKEN'} #example: {'Authorization': 'token dkd8rskjdkfiwi49hgkw...'}
#Retrieve basic metadata for 10 items your account owns
r=requests.get(BASE_URL + '/account/articles?page=1&page_size=10', headers=api_call_headers)
#Load the metadata as JSON
metadata=json.loads(r.text)
#View the metadata
print(metadata)

Output:

[
  {
    "id": 8365555,
    "title": "Testing-full embargo-restrict access-unpublish-republish",
    "doi": "10.0166/FK2.stagefigshareare.8365555",
    "handle": "",
    "url": "https://api.figshare.com/v2/account/articles/8365555",
    "published_date": "2023-04-06T13:13:41Z",
    "thumb": "",
    "defined_type": 3,
    "defined_type_name": "dataset",
    "group_id": 9991,
    "url_private_api": "https://api.figshare.com/v2/account/articles/8365555",
    "url_public_api": "https://api.figshare.com/v2/articles/8365555",
    "url_private_html": "https://figshare.com/account/articles/8365555",
    "url_public_html": "https://faber.figshare.com/articles/dataset/Testing-full_embargo-restrict_access-unpublish-republish/8365555",
    "timeline": {
      "posted": "2023-04-06T13:13:41",
      "firstOnline": "2023-04-06T13:09:40"
    },
    "resource_title": null,
    "resource_doi": null
  },
  {
    "id": 8356369,
    "title": "My example dataset",
    "doi": "10.0166/FK2.stagefigshareare.8356369",
    "handle": "",
    "url": "https://api.figshare.com/v2/account/articles/8356369",
    "published_date": "2023-04-04T21:36:57Z",
    "thumb": "",
    "defined_type": 3,
    "defined_type_name": "dataset",
    "group_id": 9991,
    "url_private_api": "https://api.figshare.com/v2/account/articles/8356369",
    "url_public_api": "https://api.figshare.com/v2/articles/8356369",
    "url_private_html": "https://figshare.com/account/articles/8356369",
    "url_public_html": "https://faber.figshare.com/articles/dataset/My_example_embargoed_record_-_institutional_repository_version/8356369",
    "timeline": {
      "posted": "2023-04-04T21:36:57",
      "firstOnline": "2023-04-04T21:36:01"
    },
    "resource_title": null,
    "resource_doi": null
  },
  {
    "id": 8351662,
    "title": "Untitled Item",
    "doi": "",
    "handle": "",
    "url": "https://api.figshare.com/v2/account/articles/8351662",
    "published_date": null,
    "thumb": "",
    "defined_type": 0,
    "defined_type_name": "",
    "group_id": 9991,
    "url_private_api": "https://api.figshare.com/v2/account/articles/8351662",
    "url_public_api": "https://api.figshare.com/v2/articles/8351662",
    "url_private_html": "https://figshare.com/account/articles/8351662",
    "url_public_html": "https://faber.figshare.com/articles/dataset/_/8351662",
    "timeline": {},
    "resource_title": null,
    "resource_doi": null
  },
  {
    "id": 8332638,
    "title": "Ant behaviour data set",
    "doi": "10.0166/FK2.stagefigshareare.8332638",
    "handle": "",
    "url": "https://api.figshare.com/v2/account/articles/8332638",
    "published_date": "2023-03-24T15:16:19Z",
    "thumb": "https://s3-eu-west-1.amazonaws.com/testfigsharearepreviews.figshareare.com/830360616/thumb.png",
    "defined_type": 3,
    "defined_type_name": "dataset",
    "group_id": 9991,
    "url_private_api": "https://api.figshare.com/v2/account/articles/8332638",
    "url_public_api": "https://api.figshare.com/v2/articles/8332638",
    "url_private_html": "https://figshare.com/account/articles/8332638",
    "url_public_html": "https://faber.figshare.com/articles/dataset/Ant_behaviour_data_set/8332638",
    "timeline": {
      "posted": "2023-03-24T15:16:19",
      "firstOnline": "2023-03-24T15:16:19"
    },
    "resource_title": null,
    "resource_doi": null
  },
  {
    "id": 8329418,
    "title": "A versioned item - unpublished - republished",
    "doi": "10.0166/FK2.stagefigshareare.8329418",
    "handle": "",
    "url": "https://api.figshare.com/v2/account/articles/8329418",
    "published_date": "2023-04-06T20:35:25Z",
    "thumb": "https://s3-eu-west-1.amazonaws.com/testfigsharearepreviews.figshareare.com/830356944/thumb.png",
    "defined_type": 2,
    "defined_type_name": "media",
    "group_id": 9991,
    "url_private_api": "https://api.figshare.com/v2/account/articles/8329418",
    "url_public_api": "https://api.figshare.com/v2/articles/8329418",
    "url_private_html": "https://figshare.com/account/articles/8329418",
    "url_public_html": "https://faber.figshare.com/articles/media/A_versioned_item_-_unpublished_-_republished/8329418",
    "timeline": {
      "posted": "2023-04-06T20:35:25",
      "firstOnline": "2023-03-23T18:42:20"
    },
    "resource_title": null,
    "resource_doi": null
  },
  {
    "id": 8327074,
    "title": "Data supporting wood ants behaviour",
    "doi": "10.0166/FK2.stagefigshareare.8327074",
    "handle": "",
    "url": "https://api.figshare.com/v2/account/articles/8327074",
    "published_date": "2023-03-22T09:57:21Z",
    "thumb": "https://s3-eu-west-1.amazonaws.com/testfigsharearepreviews.figshareare.com/830355168/thumb.png",
    "defined_type": 3,
    "defined_type_name": "dataset",
    "group_id": 9991,
    "url_private_api": "https://api.figshare.com/v2/account/articles/8327074",
    "url_public_api": "https://api.figshare.com/v2/articles/8327074",
    "url_private_html": "https://figshare.com/account/articles/8327074",
    "url_public_html": "https://faber.figshare.com/articles/dataset/Data_supporting_wood_ants_behaviour/8327074",
    "timeline": {
      "posted": "2023-03-22T09:57:21",
      "firstOnline": "2023-03-22T09:57:21"
    },
    "resource_title": "A motion compensation treadmill for untethered wood ants (Formica rufa): evidence for transfer of orientation memories from free-walking training",
    "resource_doi": "10.1242/jeb.228601"
  },
  {
    "id": 8275510,
    "title": "DRI Sample Report.pdf",
    "doi": "10.0166/FK2.stagefigshareare.8275510",
    "handle": "",
    "url": "https://api.figshare.com/v2/account/articles/8275510",
    "published_date": "2023-04-07T16:12:14Z",
    "thumb": "",
    "defined_type": 3,
    "defined_type_name": "dataset",
    "group_id": 9991,
    "url_private_api": "https://api.figshare.com/v2/account/articles/8275510",
    "url_public_api": "https://api.figshare.com/v2/articles/8275510",
    "url_private_html": "https://figshare.com/account/articles/8275510",
    "url_public_html": "https://faber.figshare.com/articles/dataset/DRI_Sample_Report_pdf/8275510",
    "timeline": {
      "posted": "2023-04-07T16:12:14",
      "firstOnline": "2023-04-07T16:12:14"
    },
    "resource_title": null,
    "resource_doi": null
  },
  {
    "id": 8271566,
    "title": "Exhibition B Image",
    "doi": "10.0166/FK2.stagefigshareare.8271566",
    "handle": "",
    "url": "https://api.figshare.com/v2/account/articles/8271566",
    "published_date": "2023-02-20T15:09:10Z",
    "thumb": "https://s3-eu-west-1.amazonaws.com/testfigsharearepreviews.figshareare.com/830323572/thumb.png",
    "defined_type": 2,
    "defined_type_name": "media",
    "group_id": 9991,
    "url_private_api": "https://api.figshare.com/v2/account/articles/8271566",
    "url_public_api": "https://api.figshare.com/v2/articles/8271566",
    "url_private_html": "https://figshare.com/account/articles/8271566",
    "url_public_html": "https://faber.figshare.com/articles/media/Exhibition_B_Image/8271566",
    "timeline": {
      "posted": "2023-02-20T15:09:10",
      "firstOnline": "2023-02-20T15:09:10"
    },
    "resource_title": null,
    "resource_doi": null
  },
  {
    "id": 8262674,
    "title": "Ireland pub C18th",
    "doi": "10.0166/FK2.stagefigshareare.8262674",
    "handle": "",
    "url": "https://api.figshare.com/v2/account/articles/8262674",
    "published_date": "2023-02-16T11:29:15Z",
    "thumb": "https://s3-eu-west-1.amazonaws.com/testfigsharearepreviews.figshareare.com/830318778/thumb.png",
    "defined_type": 1,
    "defined_type_name": "figure",
    "group_id": 9991,
    "url_private_api": "https://api.figshare.com/v2/account/articles/8262674",
    "url_public_api": "https://api.figshare.com/v2/articles/8262674",
    "url_private_html": "https://figshare.com/account/articles/8262674",
    "url_public_html": "https://faber.figshare.com/articles/figure/Ireland_pub_C18th/8262674",
    "timeline": {
      "posted": "2023-02-16T11:29:15",
      "firstOnline": "2023-02-16T11:29:15"
    },
    "resource_title": null,
    "resource_doi": null
  },
  {
    "id": 8260098,
    "title": "Slides",
    "doi": "10.0166/FK2.stagefigshareare.8260098",
    "handle": "",
    "url": "https://api.figshare.com/v2/account/articles/8260098",
    "published_date": null,
    "thumb": "https://ndownloader.figshare.com/files/830317022/preview/830317022/thumb.png",
    "defined_type": 7,
    "defined_type_name": "presentation",
    "group_id": 9999,
    "url_private_api": "https://api.figshare.com/v2/account/articles/8260098",
    "url_public_api": "https://api.figshare.com/v2/articles/8260098",
    "url_private_html": "https://figshare.com/account/articles/8260098",
    "url_public_html": "https://faber.figshare.com/articles/presentation/Slides/8260098",
    "timeline": {},
    "resource_title": null,
    "resource_doi": null
  }
]

Send data

Sending information through a POST or PUT endpoint is accomplished by adding a ‘data’ variable to the request. The contents of the data variable needs to be formatted as indicated by the documentation for the API endpoint. In the example below, a new record is added to the account that created the token.

Python:

import json
import requests
#Set the base URL
BASE_URL = 'https://api.figshare.com/v2'
#Set the token in the header
api_call_headers = {'Authorization': 'token ENTER-TOKEN'} #example: {'Authorization': 'token dkd8rskjdkfiwi49hgkw...'}
#Create json formatted for upload
sample_metadata = {"title":"Test metadata for upload","keywords":["biodiversity","invertebrate"]}
json_metadata = json.dumps(sample_metadata)
#Create a private item
r = requests.post(BASE_URL + '/account/articles', headers=api_call_headers, data = json_metadata)
if r.status_code == 201: #If Post was successful
  print('Successfully created item')

Retrieve information for multiple items

This section describes how to use multiple API calls to retrieve metadata or files from multiple items. The basic idea is to:

Create a list of item ids
- Retrieve item ids in your account
- Retrieve item ids through a search query
Loop through the list and gather the information needed for each item id

Retrieve item ids in your account

import json
import requests
#Retrieve list of private metadata for 50 items. This is for unpublished and published records.
#Set the base URL
BASE_URL = 'https://api.figshare.com/v2'  <-------- Change figshare to figsh if working with the sandbox 
#Set the token in the header
api_call_headers = {'Authorization': 'token ENTER-TOKEN'} #example: {'Authorization': 'token dkd8rskjdkfiwi49hgkw...'}
#Get items owned by account
r=requests.get(BASE_URL + '/account/articles?page=1&page_size=50', headers=api_call_headers) 
items=json.loads(r.text)
if r.status_code != 200:
    print('Something is wrong:',r.content)
else:
    print('Collected',len(items),'metadata records')
#Create a list of all the item ids
item_ids = [item['id'] for item in items]  

Retrieve item ids through a search query

This example searches for records with that use the category ‘Digital Humanities’ or ‘Environmental Humanities’

import json
import requests
BASE_URL = 'https://api.figshare.com/v2'
categories = ["'History and Philosophy of the Humanities'","'Environmental Humanities'"]
#Gather basic metadata for items (articles) that meet your search criteria
results = [] #create a blank list
for i in categories:
    query = '{"search_for":":category: ' + i + '"}'
    y = json.loads(query) #Figshare API requires json paramaters
    #The number of results is unknown but you can collect up to 9,000 results. This sets the page size to 1000 results and calls the API 
    #to retrieve 9 pages of results
    for j in range(1,10):
        records = json.loads(requests.post(BASE_URL + '/articles/search?page_size=1000&page={}'.format(j), params=y).content)
        results.extend(records) #add the retrieved records to the list of records
#See the number of items
print(len(results),'items retrieved')

#Create a list of all the item ids
item_ids_full = [item['id'] for item in results]  

#Remove duplicates by converting to a dictionary and back to a list
item_ids = list( dict.fromkeys(item_ids_full) ) 
print(len(item_ids_full)-len(item_ids),'duplicate records removed,',len(item_ids),'unique records remain')

Retrieve item ids in a group

If you are at an institution (or not!), you may want to gather all the items in a particular group. The script below does just that. Note: It only returns items published in the specific group. It does not return results for subgroups

BASE_URL = "https://api.figshare.com/v2"

INST_ID = 319 #Institution ID, this example is for Iowa State University. You don't have to include this- Group is is enough
GRP_ID = 11962 #Group ID, this example is for the Iowa State University Agriculture Group.

#Gather basic metadata for items (articles) that meet your search criteria
results = [] #create a blank list
query = '{"institution": ' + str(INST_ID) + ',"group": ' + str(GRP_ID) + '}'
y = json.loads(query) #Figshare API requires json paramaters

#You can collect up to 9,000 results. This sets the page size to 1000 results and calls the API 
#to retrieve 9 pages of results. You can use this endpoint to get the number of items in 
#a group: https://docs.figshare.com/#stats_count_articles
for j in range(1,10):
    records = json.loads(requests.post(BASE_URL + '/articles/search?page_size=1000&page={}'.format(j), params=y).content)
    results.extend(records) #add the retrieved records to the list of records

#See the number of items
print(len(results),'items retrieved')

#Create a list of all the item ids
item_ids = [item['id'] for item in results] 

Now gather information for those item ids

Full metadata

Use one of the methods above to create a list of item ids called item_ids. This script includes an option to add a token in the header in case some of the item ids are for unpublished records.

import json
import requests

#Set the token in the header  if you have some private items in your list
#api_call_headers = {'Authorization': 'token ENTER-TOKEN'} #example: {'Authorization': 'token dkd8rskjdkfiwi49hgkw...'}

#---INSERT CODE TO COLLECT ITEM IDS HERE----

#Set the base URL
BASE_URL = 'https://api.figshare.com/v2'

full_records = [] #Create a blank list to hold the JSON metadata records
for id_value in item_ids: 
    r=requests.get(BASE_URL + '/account/articles/' + str(id_value), headers=api_call_headers)
    metadata=json.loads(r.text)
    full_records.append(metadata) #Add the collected metadata record to the master list of records

print(len(full_records),metadata records collected)

Views and Downloads

The options for views, downloads, and shares are described here: https://docs.figshare.com/#stats. Note that the endpoints for Breakdown and Timeline require a special administrator authentication that you can request for your institution through https://support.figshare.com.

In the example below, the total views and downloads for each item id are collected. Note that this script uses the pandas package to create a table of values.

import json
import requests
import pandas as pd 

#---INSERT CODE TO COLLECT ITEM IDS HERE----

#Set the base URL
BASE_URL = 'https://stats.figshare.com'

stats = [] #This list will hold a dictionary for each item_id

for i in item_ids:
    #make two api calls to retrieve total views and total downloads
    r=requests.get(BASE_URL + '/total/views/article/'+ str(i))
    views=json.loads(r.text)
    r=requests.get(BASE_URL + '/total/downloads/article/'+ str(i))
    downloads=json.loads(r.text)
    #Add the item_id, views, and downloads as a dictionary and add to the stats list.
    item_stats = {"item_id":i,"total_views":views['totals'],"total_downloads":downloads['totals']}
    stats.append(item_stats)
    
#Convert the list of dictionaries to a dataframe
df = pd.DataFrame(stats)

#Show the top three rows of the dataframe
df.head(3)

Output:

item_id	total_views	total_downloads
13259588	243	54
1138718	7552	3030

Download file(s)

There are several ways to download files through the API. Each file that is part of a record has it’s own download URL. You can find this URL in the full metadata retrieved through the API or you can enter the item id at this endpoint: https://docs.figshare.com/#article_files. For item https://doi.org/10.6084/m9.figshare.5616409.v3, the download URL for the file is https://figshare.com/ndownloader/files/9778696. Visiting that URL will automatically start the download (File is 219KB).

Python:

import json
import requests
from pathlib import Path

#---INSERT CODE TO COLLECT ITEM IDS HERE----

#Set the base URL
BASE_URL = 'https://api.figshare.com/v2'

file_info = [] #a blank list to hold all the file metadata

for i in item_ids:
    r = requests.get(BASE_URL + '/articles/' + str(i) + '/files')
    file_metadata = json.loads(r.text)
    for j in file_metadata: #add the item id to each file record- this is used later to name a folder to save the file to
        j['item_id'] = i
        file_info.append(j) #Add the file metadata to the list

#Download each file to a subfolder named for the article id and save with the file name
for k in file_info:
    response = requests.get(BASE_URL + '/file/download/' + str(k['id']), headers=api_call_headers)
    Path(str(k['item_id'])).mkdir(exist_ok=True)
    open(str(k['item_id']) + '/' + k['name'], 'wb').write(response.content)
    
print('All done.')

Output: The download script will save files to folders named for the item id the files belong to picture of a folder window with two folders of downloaded files

Example workflows

Create an author report

The API can provide information for individual authors. As an example, we created a Jupyter Notebook in Google Colab.

Note The script retrieves statistics and if you want to do this for an account within an intsitutional repository, you need to adjust the stats urls in the script and will need your institution’s stats credentials. See the notes in the script. Also see the stats info in the ‘Advice and Guidance’ section at the top of the page.

Anyone can run this script, though you’ll need to sign into a Google account to do so. Then you must only add an author name or an ORCID. The script produces a table with metadata for the author’s outputs, views and downloads, and produces a plotly map of views for all the records (stats will only be for records stored in figshare.com accounts).

Here is the order of operations:

Import libraries and set base API URL
Perform search for name or ORCID using this endpoint: https://docs.figshare.com/#articles_search Then create a dataframe
Gather views and downloads for each record from this endpoint: https://docs.figshare.com/#stats_totals Then create a dataframe
Merge the metadata and stats dataframes
Optionally collect all the same information for Collections using the collection endpoints
Format the date information (extract dates from the JSON and add them to the dataframe
Gather views by country for each item using this endpoint: https://docs.figshare.com/#stats_breakdown
Map the views
Save the dataframe and you can copy the map images from the browser

Create records, upload files, and publish

A common use of the API is to create metadata records from existing metadata. For example you may have metadata harvested from somewhere else that you want to include in your repository as metadata only or linked records. Or you may have metadata and files that you want to add to an account or repository. This section will assume you have already formatted the metadata for upload.

Create a metadata only or linked file record

To create records using the API, you need to use three endpoints: 1) create a private record, optionally add a link to a file, and then publish the record.

This Python example will create a linked file record for a paper- effectively it is a metadata only record that points users to the publisher’s DOI.

import json
import requests

#Set the token in the header and base URL
#Open a token stored in a text document:
text_file = open("symp-figsh-token.txt", "r") #Change this file name
TOKEN = text_file.read()
TOKEN.strip() #removes any hidden spaces
text_file.close()

#Or add your token as plain text (if you do, don't share this script with anyone):
#api_call_headers = {'Authorization': 'token ' + TOKEN}

#Set the base URL
BASE_URL = 'https://api.figshare.com/v2' #To use stage instance replace with 'https://api.figsh.com/v2'

#Create a json file with 1 record
jsonfile = {
  "title": "Producing performance-advantaged bioplastics",
  "description": "Abstract: A grand challenge for bio-based plastics is the ability to cost-effectively manufacture high-performance polymers directly from renewable resources that are also recyclable-by-design. A one-step conversion of xylose to polyesters has been reported, combining a sustainable lifecycle with impressive materials performance.",
  "tags": [
    "bioplastics"
  ],
  "categories": [
      614,
	  1058
  ],
  "authors": [
    {
	"name":"Robin M. Cywar"
	},
	{
      "id": "Gregg Beckham"
    }
  ],
  "defined_type": "journal contribution",
  "license": 1,
  "doi": "",
  "handle": "",
  "resource_doi": "10.1038/s41557-022-01030-y",
  "resource_title": "Producing performance-advantaged bioplastics",
    "timeline": {
    "firstOnline": "2022-11-26"
  },
  "funding": "National Renewable Energy Laboratory",
  "custom_fields": {
    "Journal / Parent title": "Nature Chemistry",
	"Volume": "14",
	"Pagination":"967-969"
  }
}

#You can also open a json file if you prefer:
#with open("beckham-record.json", "r", encoding='utf8') as read_file: #Replace this with the filename of your choice
#    jsonfile = json.load(read_file)

#Upload the record

record_fails = []

jsonresult = json.dumps(jsonfile) #Takes one record and makes it a json string (double quotes)
r = requests.post(BASE_URL + '/account/articles', headers=api_call_headers, data = jsonresult)
if r.status_code != 201:
    record_fails.append(str(r.content[0:75])) #Add failed index to list with partial description
else:

    #Remove the admin account as an author by updating the record just created
    #This uses the article url returned by the API response (r)
    authordict = {}
    authordict['authors'] = jsonfile['authors']
    authorjson = json.dumps(authordict) #formats everything with double quotes
    response_json = json.loads(r.content)
    new_url = response_json['location']
    s = requests.put(new_url, headers=api_call_headers, data = authorjson) 
    
    #Upload a link as a file
    link = '{"link":"https://doi.org/'+ jsonfile['resource_doi'] +'"}'
    tup = new_url.rpartition('/') #From end of URL, split string '/' so we can get the article id
    article_id = str(tup[2]) #Then select the id from the resulting tuple.
    t = requests.post(BASE_URL + '/account/articles/' + str(article_id) +'/files', headers=api_call_headers, data = link)

    #Publish the record
    u = requests.post(BASE_URL + '/account/articles/' + str(article_id) +'/publish', headers=api_call_headers)
    
        
#print(len(record_fails),"record fails. Failed record indexes:",record_fails)
print('The record is created and available here:',json.loads(u.content)['location'])

Upload files

The process to upload files may seem complex but this is necessary to handle extremely large files. So whether your file is a few kilobytes or a terabyte, this process will work.

The overall process to upload files looks like this: 1) create a private record, upload file(s), and then publish the record.

As described in the upload documentation, there are four steps to uploading: 1) Initiate the upload 2) Receive the number of file parts, 3) Upload the file parts, and 4) Complete the upload. The documentation linked above has a sample script in several languages. A Jupyter Notebook and Python example is downloadable from this Google Colab file and is reproduced below:

BASE_URL = 'https://api.figshare.com/v2/{endpoint}'
TOKEN = 'ENTER TOKEN BETWEEN QUOTES'
CHUNK_SIZE = 10485760 #about 10MB
FILE_PATH = 'FILE NAME' #If the file is in the same folder as this notebook, put the name of the file in quotes. E.g. 'file-name.mp4'
TITLE = 'Uploaded through API' #This is the title of the item you want to upload to.
#Define functions that split the file, initiate upload, and complete the upload

def raw_issue_request(method, url, data=None, binary=False):
    headers = {'Authorization': 'token ' + TOKEN}
    if data is not None and not binary:
        data = json.dumps(data)
    response = requests.request(method, url, headers=headers, data=data)
    try:
        response.raise_for_status()
        try:
            data = json.loads(response.content)
        except ValueError:
            data = response.content
    except HTTPError as error:
        print('Caught an HTTPError: {}'.format(error.message))
        print('Body:\n', response.content)
        raise

    return data


def issue_request(method, endpoint, *args, **kwargs):
    return raw_issue_request(method, BASE_URL.format(endpoint=endpoint), *args, **kwargs)


def list_articles():
    result = issue_request('GET', 'account/articles')
    print('Listing current articles:')
    if result:
        for item in result:
            print(u'  {url} - {title}'.format(**item))
    else:
        print('  No articles.')



def create_article(title):
    data = {
        'title': TITLE  # You may add any other information about the article here as you wish.
    }
    result = issue_request('POST', 'account/articles', data=data)
    print('Created article:', result['location'], '\n')

    result = raw_issue_request('GET', result['location'])

    return result['id']


def list_files_of_article(article_id):
    result = issue_request('GET', 'account/articles/{}/files'.format(article_id))
    print('Listing files for article {}:'.format(article_id))
    if result:
        for item in result:
            print('  {id} - {name}'.format(**item))
    else:
        print('  No files.')

    print


def get_file_check_data(file_name):
    with open(file_name, 'rb') as fin:
        md5 = hashlib.md5()
        size = 0
        data = fin.read(CHUNK_SIZE)
        while data:
            size += len(data)
            md5.update(data)
            data = fin.read(CHUNK_SIZE)
        return md5.hexdigest(), size


def initiate_new_upload(article_id, file_name):
    endpoint = 'account/articles/{}/files'
    endpoint = endpoint.format(article_id)

    md5, size = get_file_check_data(file_name)
    data = {'name': os.path.basename(file_name),
            'md5': md5,
            'size': size}

    result = issue_request('POST', endpoint, data=data)
    print('Initiated file upload:', result['location'], '\n')

    result = raw_issue_request('GET', result['location'])

    return result


def complete_upload(article_id, file_id):
    issue_request('POST', 'account/articles/{}/files/{}'.format(article_id, file_id))


def upload_parts(file_info):
    url = '{upload_url}'.format(**file_info)
    result = raw_issue_request('GET', url)

    print('Uploading parts:')
    with open(FILE_PATH, 'rb') as fin:
        for part in result['parts']:
            upload_part(file_info, fin, part)


def upload_part(file_info, stream, part):
    udata = file_info.copy()
    udata.update(part)
    url = '{upload_url}/{partNo}'.format(**udata)

    stream.seek(part['startOffset'])
    data = stream.read(part['endOffset'] - part['startOffset'] + 1)

    raw_issue_request('PUT', url, data=data, binary=True)
    print('  Uploaded part {partNo} from {startOffset} to {endOffset}'.format(**part))


def main():
    # We first create the article
    list_articles()
    article_id = create_article(TITLE)
    list_articles() #Prints the list again and you should see your new item at the top of the list
    list_files_of_article(article_id)

    # Then we upload the file.
    file_info = initiate_new_upload(article_id, FILE_PATH)
    # Until here we used the figshare API; following lines use the figshare upload service API.
    upload_parts(file_info)
	# We return to the figshare API to complete the file upload process.
    complete_upload(article_id, file_info['id'])
    list_files_of_article(article_id)

#Run this to upload the file.
if __name__ == '__main__':
    main()

Upload metadata and files in batch

If your existing metadata includes a URL or local location for the associated file(s) you can upload the metadata and files. Your existing metadata should be a list of dictionaries already formatted for upload (see the formatting page for advice). Use a for loop to upload the metadata for each record, then upload the file(s).

Batch format and upload metadata from a source

This offers an example of creating lnked file records based on harvested metadata. The resulting items clearly indicate the URL or DOI the user can click to find the original record. The example below is for paper metadata, but it can easily work with data metadata too. Macquarie University created a script to create catalog records in their repository from Dryad records.

Here are the steps to format metadata harvested from the Dimensions database and create linked file records:

Open a json file (in the code below, an example file from Dimensions is manually created)
Pull out the relevant fields and give them the proper keys (account for partial dates, author formatting, and missing abstracts)
Convert the json record to a string with double quotes
Upload the record
Update the author list of the new record (removes the admin account as an author)
Add the existing DOI as a linked file

This can upload to a specific group with specific custom metadata. You can change the api key to upload to different accounts or use impersonation.

Here is an example Python script. Note that this does not add Categories or Keywords which are required for publishing.

import json
import requests
import pandas as pd


#Set the token directly (don't share this notebook with this information)
TOKEN = "ENTER TOKEN HERE WITH QUOTES"

#Set the token in the header and base URL (change the file name and location)
#text_file = open("../../../testing-token-symlilliput.txt", "r")
#TOKEN = text_file.read()
#TOKEN.strip() #removes any hidden spaces
#text_file.close()


api_call_headers = {'Authorization': 'token ' + TOKEN}

#Set the base URL
BASE_URL = 'https://api.figshare.com/v2'  #<--------- If using a sandbox, change this to 'https://api.figsh.com/v2'

#For demonstration, create a json file with 2 records- this is json formatted metadata from the Dimensions API.
#Feel free to change the information below so you can easily find these examples if using a shared sandbox
jsonfile = [
	{
		"abstract": "As robots are deployed to work in our environments, we must build appropriate expectations of their behavior so that we can trust them to perform their jobs autonomously as we attend to other tasks. Many types of explanations for robot behavior have been proposed, but they have not been fully analyzed for their impact on aligning expectations of robot paths for navigation. In this work, we evaluate several types of robot navigation explanations to understand their impact on the ability of humans to anticipate a robot’s paths. We performed an experiment in which we gave participants an explanation of a robot path and then measured (i) their ability to predict that path, (ii) their allocation of attention on the robot navigating the path versus their own dot-tracking task, and (iii) their subjective ratings of the robot’s predictability and trustworthiness. Our results show that explanations do significantly affect people’s ability to predict robot paths and that explanations that are concise and do not require readers to perform mental transformations are most effective at reducing attention to the robot.",
		"authors": [
			{
				"affiliations": [
					{
						"city": "Pittsburgh",
						"city_id": 5206379,
						"country": "United States",
						"country_code": "US",
						"id": "grid.147455.6",
						"name": "Carnegie Mellon University",
						"raw_affiliation": "Carnegie Mellon University, Pittsburgh, PA, USA",
						"state": "Pennsylvania",
						"state_code": "US-PA"
					}
				],
				"corresponding": "",
				"current_organization_id": "",
				"first_name": "Stephanie",
				"last_name": "Rosenthal",
				"orcid": [],
				"raw_affiliation": [
					"Carnegie Mellon University, Pittsburgh, PA, USA"
				],
				"researcher_id": "null"
			},
			{
				"affiliations": [
					{
						"city": "Pittsburgh",
						"city_id": 5206379,
						"country": "United States",
						"country_code": "US",
						"id": "grid.147455.6",
						"name": "Carnegie Mellon University",
						"raw_affiliation": "Carnegie Mellon University, Pittsburgh, PA, USA",
						"state": "Pennsylvania",
						"state_code": "US-PA"
					}
				],
				"corresponding": "",
				"current_organization_id": "",
				"first_name": "Peerat",
				"last_name": "Vichivanives",
				"orcid": [],
				"raw_affiliation": [
					"Carnegie Mellon University, Pittsburgh, PA, USA"
				],
				"researcher_id": "null"
			},
			{
				"affiliations": [
					{
						"city": "Pittsburgh",
						"city_id": 5206379,
						"country": "United States",
						"country_code": "US",
						"id": "grid.147455.6",
						"name": "Carnegie Mellon University",
						"raw_affiliation": "Carnegie Mellon University, Pittsburgh, PA, USA",
						"state": "Pennsylvania",
						"state_code": "US-PA"
					}
				],
				"corresponding": "",
				"current_organization_id": "grid.147455.6",
				"first_name": "Elizabeth",
				"last_name": "Carter",
				"orcid": [
					"0000-0002-2735-148X"
				],
				"raw_affiliation": [
					"Carnegie Mellon University, Pittsburgh, PA, USA"
				],
				"researcher_id": "ur.012471523754.61"
			}
		],
		"date": "2022-12-31",
		"doi": "10.1145/3526104",
		"id": "pub.1147121755",
		"title": "The Impact of Route Descriptions on Human Expectations for Robot Navigation",
		"year": 2022
	},
	{
		"abstract": "Data-driven network intrusion detection (NID) has a tendency towards minority attack classes compared to normal traffic. Many datasets are collected in simulated environments rather than real-world networks. These challenges undermine the performance of intrusion detection machine learning models by fitting machine learning models to unrepresentative “sandbox” datasets. This survey presents a taxonomy with eight main challenges and explores common datasets from 1999 to 2020. Trends are analyzed on the challenges in the past decade and future directions are proposed on expanding NID into cloud-based environments, devising scalable models for large network data, and creating labeled datasets collected in real-world networks.",
		"authors": [
			{
				"affiliations": [
					{
						"city": "Pittsburgh",
						"city_id": 5206379,
						"country": "United States",
						"country_code": "US",
						"id": "grid.147455.6",
						"name": "Carnegie Mellon University",
						"raw_affiliation": "Carnegie Mellon University, Pittsburgh, PA",
						"state": "Pennsylvania",
						"state_code": "US-PA"
					}
				],
				"corresponding": "",
				"current_organization_id": "",
				"first_name": "Dylan",
				"last_name": "Chou",
				"orcid": [],
				"raw_affiliation": [
					"Carnegie Mellon University, Pittsburgh, PA"
				],
				"researcher_id": "null"
			},
			{
				"affiliations": [
					{
						"city": "Notre Dame",
						"city_id": 8469294,
						"country": "United States",
						"country_code": "US",
						"id": "grid.131063.6",
						"name": "University of Notre Dame",
						"raw_affiliation": "University of Notre Dame, Notre Dame, Indiana",
						"state": "Indiana",
						"state_code": "US-IN"
					}
				],
				"corresponding": "",
				"current_organization_id": "grid.131063.6",
				"first_name": "Meng",
				"last_name": "Jiang",
				"orcid": [
					"0000-0002-3009-519X"
				],
				"raw_affiliation": [
					"University of Notre Dame, Notre Dame, Indiana"
				],
				"researcher_id": "ur.010456457135.66"
			}
		],
		"date": "2022-12-31",
		"doi": "10.1145/3472753",
		"id": "pub.1141731091",
		"title": "A Survey on Data-driven Network Intrusion Detection",
		"year": 2022
	}
]

#Format records for upload.

result = []
doi_list = []
for item in jsonfile:
    my_dict={}
    my_dict['title']=item.get('title')
    if 'abstract' in item: #abstract isn't always present
        my_dict['description']=item.get('abstract')
    else:
        my_dict['description']="No description available"
    authors = [] #format authors
    for name in item['authors']:
        authorname = {"name" : name['first_name'] + " " + name['last_name']}
        authors.append(authorname)
    my_dict['authors']= authors
    my_dict['defined_type'] = 'journal contribution'
    my_dict['doi']= item.get('doi')
    my_dict['resource_doi']= item.get('doi')
    my_dict['resource_title']=item.get('title')
    #my_dict['references'] = [item.get('URL')]
    my_dict['timeline'] =  {"firstOnline" : str(item['year']) + "-01-01"} #year only 
    #my_dict['is_metadata_record'] = True #Use these if you want a metadata only record
    #my_dict['metadata_reason'] = 'See publisher version'
    result.append(my_dict)
    doi_list.append(item['doi'])


print(len(result),"records are ready for upload.")


#Upload the records

record_fails = []
partial_record_ids = []
created_record_ids = [] #Use this to delete all the draft records if needed 
success_count = 0
count = 0 #This just tracks what index value the loop is on and is used to connect the metadata with the DOI update

for index, item in enumerate(result):
    jsonresult = json.dumps(item) #Takes one record and makes it a json string (double quotes)
    r = requests.post(BASE_URL + '/account/articles', headers=api_call_headers, data = jsonresult)
    if r.status_code != 201:
        record_fails.append(str(index) + ":" + str(r.content[0:75])) #Add failed index to list with partial description
        count += 1
    else:
        count += 1 #increment here otherwise have to do it at each if statement below
        #Remove the admin account as an author by updating the record just created
        #This uses the article url returned by the API response (r)
        
        # Get the location URL and item id
        response_json = json.loads(r.content)
        new_url = response_json['location']
        item_id = response_json['entity_id']
        created_record_ids.append(item_id)
        
        
        #Format and update authors
        authordict = {}
        authordict['authors'] = item['authors']
        authorjson = json.dumps(authordict) #formats everything with double quotes
        s = requests.put(new_url, headers=api_call_headers, data = authorjson) 
        if r.status_code != 201:
            record_fails.append(str(index) + "failed at author update:" + str(r.content[0:75])) #Add failed index to list with partial description         
            partial_record_ids.append(item_id)
        else:
            #Upload a link as a file
            link = '{"link":"https://doi.org/'+ str(doi_list[count-1]) + '"}' #count-1 because already incremented value to next index
            t = requests.post(new_url +'/files', headers=api_call_headers, data = link)
            if r.status_code != 201:
                record_fails.append(str(index) + "failed at doi update:" + str(r.content[0:75])) #Add failed index to list with partial description
                partial_record_ids.append(item_id)
            else:
                success_count += 1

        
print(success_count,"records created and updated. There were",len(result)-len(created_record_ids),"records that were not created at all.")
print('There were',len(partial_record_ids),'records created but with a failed DOI link.')
print("Failed record descriptions:",record_fails)

Delete account items

You may have many draft items in your account, or if you are an admin, perhaps you’ve been testing with the Batch Management Tool, and you want to remove items.

Important Note: You cannot delete public items. For figshare.com users, you must contact Figshare to unpublish an item. For institution admins, this script provides a way to get a list of item ids to unpublish.

The following script uses a token from your account to get a list of both public and private/draft items in your account. If you only want to delete the current private items, you can pass the entire list of item ids to Figshare for deletion and Figshare will ignore requests to delete public items. If you are an admin who wants to delete everything, you’ll need to paste the list of item ids into your admin unpublish interface (in the Administration area). Figshare will unpublish the public items and then you can finish running the script to delete all the item ids.

import json
import requests
import pandas as pd
import csv

#Set the token directly (delete the lines above)
TOKEN = 'ENTER TOKEN BETWEEN SINGLE QUOTES'

api_call_headers = {'Authorization': 'token ' + TOKEN}

#Set the base URL
BASE_URL = 'https://api.figsh.com/v2' <---If using a production repository or figshare.com account change this to https://api.figshare.com/v2


#Now, gather records from your repository.

repository_items = []
num_pages = 1  #------> Change this as needed, but the code is set to gather 1000 records.

for i in range(1,num_pages+1):
    item = json.loads(requests.get(BASE_URL + '/account/articles?page_size=1000&page={}'.format(i), headers=api_call_headers).content)
    repository_items.extend(item)
print(len(repository_items),'metadata records collected')

df = pd.DataFrame(repository_items)
article_ids = df['id'].tolist()

print("If you need to unpublish items, paste these ids (no square brackets) in your admin interface:",article_ids)


#You may want to split this next block of code from the above blocks so that you can run it after unpublishing item ids.

delete_article_fails = []

for article in article_ids:
    r = requests.delete(BASE_URL + '/account/articles/' + str(article), headers=api_call_headers)
    if r.status_code != 204:
        delete_article_fails.append(str(article) + ":" + str(r.content[0:75])) #Add failed index to list with partial description

print(len(delete_article_fails), 'errors!')
print('Fail descriptions:',delete_article_fails)

Retrieve metadata for items in review

There are several API endpoints that provide review/curation information. Reviewers can sort and inspect items in review through the GUI. But maybe you’d like to see a spreadsheet of the metadata for items that have a ‘pending’ status for review. This Google Colab notebook (also available as a Jupyter notebook) does just that. Be sure to use an admin token (not just a reviewer account token).

Here are the steps in that notebook:

Retrieve a list of items with pending status in review
Visit each item and retrieve all the metadata. Also add the owner name and email.
Create a spreadsheet and do some formatting: dates, authors, add item owner, separate out custom fields.
Save the spreadsheet

What if I want to publish those items?

You may want to bulk publish a group of items but avoid having to review them all. This isn’t possble through the API since that would defeat the purpose of review. However, you can use the Admin Batch Management tool to bulk publish items with review (making the assumption you reviewed the metadata in the CSV!). To do this you need to create a CSV with the item ids to publish. You can do this for pending review items that are in the spreadsheet from above. Here are the steps:

Save a copy of the spreadsheet downloaded above as CSV (important: if using Excel, save as CSV rather than CSV UTF-8. The latter sometimes adds additional characters in the file)
Delete all the items (rows) that you don’t want to publish
Delete all columns except ‘id’
Rename the ‘id’ column to ‘article_id’ and save the file
You should now have a CSV with only one column of numbers titled ‘article_id’
Log in to an administrator Figshare account that has review privileges.
Navigate to the Batch Management tool and select your CSV as the source file. Select the option to automatically review items.

Impersonating user accounts

Impersonating accounts is described in this section of the documentation. Adding ‘impersonate=account_id’ to either the request URL for GET or in the body of a POST or PUT request will impersonate the account indicated. Here is a brief exaple that changes the author on an item owned by a user account and then publishes the item:

import json
import requests

#Set the token in the header and base URL
text_file = open("./././testing-token.txt", "r") #Paste your token in a text file and save it where this notebook is
TOKEN = text_file.read()
TOKEN.strip() #removes any hidden spaces
text_file.close()


api_call_headers = {'Authorization': 'token ' + TOKEN}

#Set the base URL
BASE_URL = 'https://api.figshare.com/v2' #Change this to 'https://api.figsh.com/v2' if you want to test in stage

#Get the author info from this endpoint: https://docs.figshare.com/#private_institution_accounts_list
new_user_id = 12345 #this is the user_id in the endpoint output
account_id = 6789 #This is 'id' in the endpoint output and is the account id for impersonation

#Set up the json to replace the author(s)
authors = {'authors': [{'id': new_user_id}], 'impersonate': account_id}
author_json = json.dumps(authors)

#Store the item id to be edited (you could also loop through a list of ids)
item_id = 65432 
    
s = requests.put(BASE_URL + '/account/articles/' + str(item_id), headers=api_call_headers, data = author_json) 
if s.status_code != 205:
	print(str(s.content)) #print error
else:
	#Publish the record
    body = '{"impersonate":' + str(account_id) + '}' #impersonate to publish
    u = requests.post(BASE_URL + '/account/articles/' + str(i) +'/publish', headers=api_call_headers, data = body)
    if u.status_code != 201:
        print(str(u.content)) print error
    else:
        print('Published!')

Create a repository dashboard

This is an example of how one might use the API to create a database for reporting purposes. The script downloads metadata from your repository along with views by item, views by country, and views by month. Eight tables are saved as worksheets within a Google Sheet. This sheet can be regularly updated so that any dashboards built off it will show current information.

Note: You need to have two sets of credentials:

A repository administrator token
The credentials for the stats API (ask Support if you do not have them yet)

The Jupyter Notebook is available and you can run it in Colab. Here are the high level steps in the process:

Import libraries
Set variables and metadata API credentials
Retrieve all item ids and pull out only public item ids
Retrieve full metadata and total views for each item id
Create dataframes for each type of information: items, authors, funding, categories, and tags. Each dataframe includes the item id.
Set variables and credentials for the stats API
Use the stats API to retrieve views by country and over time.
Either create a Google sheet or update an existing one

Visualize the data

You should be able to use the resulting spreadsheet as a data source for you favorite visualization software. We made a very basic proof of concept Looker Studio dashboard based on data from a live repository.

You can search for your institution’s name and do a high level analysis of the records out there. As an example, we created a Jupyter Notebook in Google Colab.

Anyone can run this script. One must only add search criteria and run all the cells in sequence. You can also save all the data as a Google sheet in your Google Drive.

Here is the order of operations:

Import libraries and set base API endpoint and institution name
Search using this endpoint: https://docs.figshare.com/#articles_search and create a dataframe
Collect the full metadata for each item and create separate lists for item metadata, authors, funding, categories, tags, and files.
Do some formatting and make dataframes from all the lists
Start analysis by looking at the main dataframe columns
Look at how many records are from institutions (using the existence of group_id)
Item types with views
Look at licenses
See a list of linked funders (linked to Dimensions records)
See grant titles
See the top 20 tags
See the top 20 categories
See top 10 viewed items
Export all the data to a Google sheet that you can then use for visualizations or other analysis

Custom Web Apps

You can create custom interfaces to display information from your Figshare account or repository. Online tools, like ChatGPT, can help. This was demonstrated in the API workshop Figshare ran at the 2023 Open Repositories Conference. You can see the resources for this demonstration on the workshop page and the workshop slides.

Figshare API Resources

Advice and Guidance

Examples

Retrieve data

Authenticate

Send data

Retrieve information for multiple items

Retrieve item ids in your account

Retrieve item ids through a search query

Retrieve item ids in a group

Now gather information for those item ids

Full metadata

Views and Downloads

Download file(s)

Example workflows

Create an author report

Create records, upload files, and publish

Create a metadata only or linked file record

Upload files

Upload metadata and files in batch

Batch format and upload metadata from a source

Delete account items

Retrieve metadata for items in review

What if I want to publish those items?

Impersonating user accounts

Create a repository dashboard

Visualize the data

Search for records related to your institution

Custom Web Apps