A Great Year for Collaborative Revolutions

November 4, 2011

As we near its end, it seems fair to say that 2011 was a great year for peaceful revolutions. First of all, the many pivotal transformations in North Africa excited and spurred watershed changes in parts of the world that – to those looking from the outside – seemed doomed to be run by dictators. Similarly, and perhaps inspired by this, the Occupy Wall Street movement has spread to cities throughout the US and Europe: my personal favorite is ‘Occupy Winschoten’, where about 40 people demonstrated this October ‘for change’ at the local bank building of this town of 18,000 in the far northeast of Holland [1, in Dutch]. These revolutions are possibly caused, and certainly enabled, by the ubiquitous presence of information technologies and access to unprecedented knowledge and communication channels. As Zbigniew Brzezinski put it: “The nearly universal access to radio, television and increasingly the Internet is creating a community of shared perceptions… Millions of students are revolutionaries-in-waiting, already semi-mobilized in large congregations, connected by the Internet.” [2]

In our own area of scholarly communication, 2011 also seems to be a transformative year for coordinating and creating lasting change in the way we publish science and the humanities. I attended seven events this year – and know of at least three more – that not only discussed, but concretely helped shape and accelerate, change in scholarly publishing (see list at the bottom). At each event, I learned about many new efforts, and everywhere, there was a shared mood of excitement about the change afoot. Overall, I felt a sense that we are collectively transforming how the vast collection of knowledge in science, medicine, and the humanities available online can be accessed and integrated, and who has access to it. Through individual and joint efforts by all the parties in the information cycle – scientists, authors, editors, publishers, data repositories, libraries, software producers, and the lay public – we are developing tools and opportunities for creating, sharing and accessing knowledge in a way that improves science, medicine and the humanities, and allows new members to participate in these endeavors, including ‘the South’ and the general public.

This year of transformation started with a workshop in San Diego entitled ‘Beyond the PDF’ (see also an earlier blog post [3]). One of the great insights, and outcomes, of the workshop to me was that in an open discussion between scientists, librarians and publishers, we agreed that there is value in the services each of us offers, and that it is a good idea to sit down and work together on a model that will enable sustained and efficient forms of publishing, going forward. We did not solve the closed/open access debate, but at least we were openly starting it and seeing many examples of innovation in scholarly publishing that include a role for many traditional, and several new, parties.

In May, I attended an event at Harvard where one of the focal areas was how, in this day and age, the value and impact of a scientist can truly be assessed. The Provost of Harvard was one of the key participants here, and it was fascinating to think about the challenge faced by someone who can hire basically anyone in any field. What are the qualities you look for, and how are these measured? Educational skills, managerial skills and a charismatic personality are obviously all good traits for a fully tenured Harvard professor to have, but how do you weigh these against more traditional measures – or even assess them, in the first place? What role does technology play in expressing and assessing scientific impact on a personal, departmental or institutional level?

In June, the Elsevier Computer Science publishing group organized the Executable Papers Challenge in Singapore. This well-attended workshop at ICCS2011 offered an impressive overview of the breadth and maturity of efforts that various groups have developed to enable the execution and validation of software and data inside publications, and offered an exciting glimpse into a future where you don’t just read a paper – you run it.

And last week I attended a meeting on Transforming Scholarly Publishing, co-hosted by Microsoft and Harvard (see also Dave De Roure’s blog, [4]). One of the most amazing parts of this workshop was the mind-boggling overview of tools given during the first day, a whirlwind tour of 19 tools, platforms and solutions that are currently transforming scholarly publishing. De Roure’s breakout group added to this by creating a list of further tools – a list well worth following, and helping to maintain [5].

But the most intense meeting for me this year was one I helped organise: the Dagstuhl Perspectives workshop in August dubbed (during the meeting) “Force11: the Future of Research Communication and e-Scholarship 2011”. Held at the spectacularly brain-stimulating Dagstuhl Castle, and sponsored by the Leipzig Institute, Force11 was a meeting of scholars, librarians, archivists, publishers and research funders who, individually and collectively, aim to bring about a change in scholarly communication through the effective use of information technology. As a key outcome this month we published the ‘Force11 Manifesto’, a description of the issues those at the workshop defined that impede change, and a vision and plan on how to overcome these impediments [6]. This vision focuses on two aspects of the scholarly publication: firstly, exploring and experimenting with new forms, and secondly, developing new models of access, credit and attribution.

The first part entails experimenting with new, enriched forms of scholarly publications consisting of rich and interconnected relationships between knowledge, claims and data. It requires the creation of platforms to create and share computationally executable components, such as workflows, computer code and statistical calculations, as scientifically valid pieces of content, and the development of an infrastructure that allows these components to be made accessible, reviewed, referenced and attributed. To do this, we have to develop best practices for depositing research datasets in repositories that enable linking to relevant documents and have high compliance levels driven by appropriate incentives, resources and policies. For the scientific domain, new forms of publication must facilitate the reproducibility of results: the ability to preserve and re-perform executable workflows or services. This will require us to reconstruct the context within which these objects were created and track them as data objects that evolve through time. In this way, the content of communications about research will follow the same evolutionary path that we have seen for general web content: a move from the static to the increasingly dynamic, and from top-down articles to grass-roots blogs. It also means revisiting the narrative structure of scholarly papers, and identifying portions where this narrative may be more structured for improved computational access, without losing the strong cognitive impact that a good story can have.

The second component of the Force11 vision requires changes to the complex socio-technical and commercial ecosystem of scholarly publishing. In particular, to obtain the benefits that networked knowledge promises, we have to explore reward systems that encourage scholars and researchers to participate in and contribute to these efforts. This means acknowledging that a journal impact factor is a poor surrogate for measuring the true impact of scholarship, and increasingly irrelevant in a world of disaggregated knowledge units of vastly varying granularity. It requires deriving new mechanisms that allow us to measure the true contribution a particular record of scholarship makes to the world’s store of knowledge. It also requires all those involved in the scholarly information life cycle to acknowledge that current business models are no longer adequate to support the rich, variegated, integrated and disparate knowledge offerings which new technologies enable, new scholarship requires, and new players in the scholarly field (including non-Western countries and the general public) deserve. In a collaboration involving scholars, publishers, libraries, funding agencies, academic institutions, and software developers, we need to develop models that can enable this exciting future to develop, while offering sustainable forms of existence for the constituent parties.

A great outcome of attending all of these meetings was that I was approached at the Microsoft workshop last week by several librarians who want to join Force11 and work together to redefine ‘the research library of the future’. As with the overthrown governments in North Africa, as we create new systems to create and access science and the humanities we need to build up new structures to govern and keep track of them. But there seems to be a great number of ideas and tools, and great enthusiasm, to do this work, and a willingness and interest to do this as a collaborative effort. I am very much looking forward to strengthening the connections made this year, and working on plans to help build these fundamentally new platforms, in 2012.


[1] http://www.dvhn.nl/nieuws/groningen/article8540311.ece/Demonstratie-Occupy-in-Winschoten-en-Groningen
[2] “Are We Witnessing the Start of a Global Revolution?” Global Research, January 27, 2011: http://www.globalresearch.ca/index.php?aid=22963&context=va
[3] Blog on beyond the PDF: http://elsevierlabs.wordpress.com/2011/01/25/back-from-beyond/
[4] See Dave De Roure’s blog at http://blogs.nature.com/eresearch/2011/10/
[5] Tools list by Dave De Roure at http://msrworkshop.tumblr.com/tagged/platforms
[6] Current version of the Force11 manifesto is at http://bit.ly/force11; website is http://force11.org

Ten workshops about transforming scholarly communication in 2011 (not an exhaustive list):
− January: A semantic, molecular future – http://www.rsc.org/ConferencesAndEvents/conference/alldetails.cfm?evid=107519
− January: Beyond the PDF – https://sites.google.com/site/beyondthepdf/
− May: Harvard eScience Workshop – http://osc.hul.harvard.edu/dss/program
− May: Royal Dutch Academy Open Data Day – http://www.knaw.nl/Pages/DEF/30/303.html
− June: Executable Paper Challenge – http://www.executablepapers.com/
− June: Alt-Metrics: Tracking scholarly impact on the social Web – http://altmetrics.org/workshop2011/
− August: Force11 – http://force11.org/
− August: Data Attribution and Citation Workshop – http://sites.nationalacademies.org/PGA/brdi/PGA_064019
− September: Science Online London – http://www.scienceonlinelondon.org/
− October: Microsoft Research eScience workshop – http://research.microsoft.com/en-us/events/escience2011-scholarly-communications/agenda.aspx

Anita de Waard, November 4, 2011

Publishing’s Perfect Storm…

October 7, 2011

There is a perfect storm forming in the publishing sector,
and not the negative one of impending doom and gloom that typically comes to
mind. This new storm, created by a tsunami of data and flood of new
technologies, is one of potential and opportunity for those bold enough to navigate
the waters. The combination of data and technology will enable us to deliver
new features and products that are able to better address our customers’
information needs. The storm is simply: “Big Data”.

“Big Data” is one of the current catch phrases in
information technology circles that can be used to mean many different things
to different people. While many people assume that it refers specifically to very
large-sized datasets, it is really more encompassing than that.  We like
the following definition of “Big Data” from O’Reilly Radar, Release 2.0: Issue


Big Data: when the size and performance requirements for data management become significant design and decision factors for implementing a data management and analysis system. For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration.


Within publishing there are three dimensions, and significant
movement along any one of them can trigger the transition of a problem or product
into the realm of “Big Data”. They are: the volume of data in question,
the complexity of calculations that have to be applied against it, and the speed
at which the processing needs to happen. These dimensions surface
as follows:

Volume – The amount of data that is being authored, crawled, and
captured is increasing exponentially. This data spans the spectrum from highly
structured record formats to completely unstructured textual information and everything
in between. These data types include the traditional well-formatted publishing
assets (articles, books, databases, etc.). In addition, we would also include
usage information, customer data, crawled web pages, extracted intelligence/metadata,
etc. The list goes on and on. As more and more of this data is captured
electronically and interlinked, it becomes much more valuable. Unfortunately it
also becomes much more unwieldy to work with.

Complexity – This flood of new data quickly overwhelms the
ability of traditional software packages and paradigms to handle it. Even smaller
datasets can strain their capabilities when extensive computations need to be
performed to generate a result. Fortunately, within the past few years,
new technology options such as Hadoop, HPCC Systems, NoSQL databases, etc. have
transformed the ability to perform extensive computations and rapidly process
large amounts of data. These technologies will form the backbone of the new
analytic and delivery capabilities for our data and products.

Speed – Frequently, these new tools and the amount of data being
processed require significant amounts of storage and processing power to
generate results quickly. Historically, having to purchase and host the
necessary computer capacity would have precluded any attempts to work at such large
scale. Fortunately, today new platform technologies exist that allow us to
store, manage, and process this data in cost-effective and scalable ways. Developments
such as virtualization, public and private cloud computing, commodity hardware,
high-speed bandwidth, etc. provide cost-effective alternatives for acquiring
on-demand compute capacity and bulk storage at the necessary scale.
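
To make the "any one dimension can trigger it" point concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Workload` and `Limits` names, the fields, and above all the threshold values, which are placeholders and not figures from this post or from any product. The point is only that a workload can qualify on complexity or speed alone, even when its volume is modest.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    """A hypothetical description of a data problem or product."""
    data_gb: float            # volume: how much data is involved
    cpu_hours_per_run: float  # complexity: cost of the calculations
    max_latency_s: float      # speed: how quickly results are needed


@dataclass
class Limits:
    """Placeholder thresholds; each organization would set its own."""
    data_gb: float = 500.0
    cpu_hours_per_run: float = 24.0
    min_latency_s: float = 1.0


def big_data_triggers(w: Workload, limits: Limits = Limits()) -> List[str]:
    """Return the dimensions along which the workload exceeds the limits."""
    triggers = []
    if w.data_gb > limits.data_gb:
        triggers.append("volume")
    if w.cpu_hours_per_run > limits.cpu_hours_per_run:
        triggers.append("complexity")
    if w.max_latency_s < limits.min_latency_s:
        triggers.append("speed")
    return triggers
```

A small dataset with a very expensive calculation, such as `Workload(data_gb=10, cpu_hours_per_run=100, max_latency_s=60)`, is flagged on complexity alone, which matches the observation above that even smaller datasets can strain traditional tools.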

Publishing has numerous entrenched challenges that are ripe
for a “Big Data” solution. Here are just a few of many that jump to mind:

  • Author disambiguation: Various attempts such as Elsevier’s
    Scopus Author Id, Thomson Reuters’ ResearcherID and the new ORCID
    initiative are testament that this is not a trivial problem. The
    complexity and scale of the calculations needed to resolve the records
    make this a prime “Big Data” candidate problem.
  • Recommendation engines: Amazon-esque suggestions of
    related material, potential upsells, and even predictive page generation
    and caching are the product requirements du jour. The amount of data and
    the required response times preclude traditional solutions.
  • Deep citation analytics: Products like Elsevier’s SciVal
    Suite and Thomson Reuters’ Research Analytics are raising the bar in
    customized, on-demand analytics. Again, the combination of data size,
    complex algorithms, and response speed is a classic hallmark of “Big Data”.
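
To give a feel for why the first of these, author disambiguation, scales so badly, here is a minimal Python sketch. It is not how Scopus Author Id, ResearcherID or ORCID work; it only illustrates one common idea, "blocking", under simplifying assumptions: names arrive as "Surname, Given", and the blocking key (ASCII-folded surname plus first initial) is a deliberately crude, hypothetical choice. Comparing every record with every other grows quadratically, whereas blocking only compares records that share a key.

```python
import unicodedata
from collections import defaultdict
from itertools import combinations


def to_ascii(text):
    """Strip accents and normalise case and whitespace."""
    folded = unicodedata.normalize("NFKD", text)
    return folded.encode("ascii", "ignore").decode().strip().lower()


def blocking_key(name):
    """Hypothetical key: 'Surname, Given' becomes 'surname_initial'."""
    surname, _, given = name.partition(",")
    return "{}_{}".format(to_ascii(surname), to_ascii(given)[:1])


def naive_pair_count(n):
    """Number of comparisons if every record is compared with every other."""
    return n * (n - 1) // 2


def candidate_pairs(records):
    """Pairs of record ids that share a blocking key.

    `records` is a list of (record_id, name) tuples. Only these pairs
    would be handed to a more expensive matching step.
    """
    blocks = defaultdict(list)
    for rec_id, name in records:
        blocks[blocking_key(name)].append(rec_id)
    pairs = []
    for ids in blocks.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)
```

With millions of author records the naive count runs into the trillions, and even with blocking each surviving pair still needs a costly comparison of affiliations, co-authors and subject areas, which is where the scale and complexity come from.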

The opportunity is here today. All that is needed to weather
the storm is a bit of vision and a good surfboard. We hope you’ll join us
for the ride. Aloha!

Curt Kohler – Director of Disruptive Technologies, Elsevier Labs

Darin McBeath – Director of Disruptive Technologies, Elsevier Labs

Categories: Big Data, Publishing

Back from Beyond, or: How I learned to stop worrying and love the PDF

January 25, 2011

As I slowly wind down from what perhaps should be called the ‘Beyond the PDF Bootcamp’, I am starting to process the whirlwind of talks, demos, conversations and interactions that took and are still very much taking place. First and foremost, it was an incredible amount of fun to see old friends, and meet new ones, and see old friends make new ones, hear the multitude of passionate voices and see the hard work everyone did to show their wares and share their thoughts at this uniquely cross-disciplinary event. At dinner just before the workshop, I sat next to Michael Kurtz from Harvard, who told me something I admit I didn’t know: that 95% of the universe is composed of dark matter and dark energy, and we don’t have a clue what either of those actually is. This concrete image of the vast amount of things we don’t know offered a great intro, to me, for an exciting, thrilling, amazing three days, and an image reflecting the sheer gigantonormousness of the questions that we are all trying to help solve, sooner rather than later.

To someone coming in from the outside it must have been pretty overwhelming to see these 50 or so scientists and developers, with the odd librarian or publisher thrown in, all individually and collectively hell-bent on ‘changing science publishing’, while continually asking each other what that means, exactly. Change – from what? What is broken now; what needs fixing? But more importantly – change to what, for whom, and by whom? And there do not seem to be any hard and fast answers – although perhaps some slow and soft ones are starting to appear. In trying to process the workshop and its many discussions, I found myself thinking about the conference as covering not so much a list of topics as a set of polar opposites. So far, I’ve found seven, which I would like to put out as a first list of axes to span a space within which we are slowly trying to orient and position ourselves. I very much invite everyone, present and not present, to add to or comment on these; this is just a first, stream-of-consciousness representation of things that struck me.

  1. Order vs. chaos.
    One of the things that I personally learned from helping Phil run this workshop was how much wonderful, useful, creative work can be done by leaning back and letting order develop out of a certain amount of chaos. In preparing for the meeting, Phil always knew quite clearly what he wanted, but he had a great down-under ‘no-worries mate’ attitude to the whole thing, while I panicked with Dutch uptight-ness about the program not being posted and the chairs not invited and how were we going to run the workshops when no one really knew what the scope or the goals were, and lots of other stuff. I admit to harassing poor Phil no end, as he dealt with the really important stuff – getting the food and the travel and the hotels together, oh and by the way running a research group and doing great science, just on the side. So a lesson I personally learned was that overplanning is not necessary, and probably doesn’t help – as Phil put it: ‘You just get a bunch of good people in a room, and then you step aside and let them get on with it.’ That is exactly what happened, and it was wonderful to see how everyone worked and stepped up to the plate and led discussions and breakout sessions and took up chores (taking notes, posting demos, commenting on Twitter) without anyone planning it or telling them what to do. It seems a testimony both to the quality of the participants and to Phil’s charismatic and understated leadership that we got where we did on Day 3 – at a point far beyond anything I think any of us had hoped for, beforehand. It seems a great model for science publishing, and science as a whole, to move forward: make sure you get interested and interesting people in the same space, offer them room to communicate and a few tools, and something amazing will emerge, without anyone steering or taking control or pushing things in a specific direction.
  2. Data vs. rhetoric
    A key discussion at the conference – as it should be, and it was wonderful to watch it emerge! – concerned the nature of the research object. On the first day, a fascinating discussion emerged about the concept of annotation – and how you can argue that papers themselves are (just) annotations of data. Likewise, reviewers’ comments, review articles, blog posts and even citations themselves can all be seen to be annotations on a particular paper, so we can visualize the information space as a series of concentric circles that surround the data in an increasing cycle of finding, observation, interpretation and comment. There were pleas for a ‘data paper’ by John Kunze of the California Digital Library, which ‘minimally consists of a cover sheet and a set of links to archived artifacts’. A step up in complexity, perhaps, are nanopublications [1] – essentially, graphs of triples with provenance. One step more in the narrative chain brings one to what used to be called modular papers [2], and is now referred to as ‘a medium-grained data object’ in the W3C Health Care and Life Sciences discussion; and finally we get to the coarse-grained, IMRaD-shaped structure that we all know and love. Somewhere in between, rhetoric happens, and it was encouraging and exciting to hear the participants bandy about terms such as rhetoric, narrative and persuasion – not words you’d have heard uttered by the semantic/bioinformatics community even five years ago! Although some of us old fogeys thought we’d made some perfectly useful wheels (e.g., [3]–[5]), it seems some of them will be reinvented; but having this discussion be as lively as it is, is much more important than rehashing what was thought up in the past. This is a vibrant and intelligent group of people who are apt to take up nuggets of what worked and reinvent parts that don’t seem useful anymore – and it is fascinating to see what will emerge as a set of research objects that can be linked, archived – and annotated.
  3. Old vs. New
    One thing that truly thrilled me was to see some of the giants in changing scientific communication, such as Peter Murray-Rust and Michael Kurtz, interacting with, listening to, and being heard by the new generation, who are all-aTwitter and have grown up in social networks in a semantic, interconnected world. Many of the good ideas from twenty years ago have been realized (as Kurtz’s sketch of the astronomy information space showed, where scientists run the journals, data stores and archives and, essentially, everyone has access to everything in the way they want – they even have an alerting system that works!). But many others have not gained the impact or traction they so deserve: to me, Scientific Markup Language is a case in point, and I hope Peter Murray-Rust will reintroduce it to this group. In aiming to provide a home for these ongoing developments, I think there is a unique opportunity to connect the old and the new, and for a cross-disciplinary group of scientists to work on some problems using ideas ranging from Linked Data to triple stores to more SGML-based, old-fashioned principles. One party that I do think is still missing in this discussion, and that I found missing at the meeting, is the Digital Repositories community, who tackle issues such as archiving and attribution, authoring and annotation, and have a lot to tell us – and perhaps some concrete needs we can help address. Hopefully this can be rectified in the future, as we establish a more solid (virtual or real) meeting space, and allow for outside contributions in the discussions that are now taking place on a daily basis.
  4. RDF vs. PDF
    Several people remarked that for a conference devoted to moving ‘Beyond the PDF’, there were a surprising number of PDFs shown! In fact, only part of the discussion focused on ‘overcoming’ this format, as the Utopia PDF viewer demonstration certainly offered a great ‘wow!’ factor [6]. I believe Steve Pettifer of Utopia even put out the challenge for people to name 10 things that they believed PDF cannot do – and build them! RDF had a great number of proponents too, and of course, ideally, everyone agreed you need both: PDF is what people prefer to consume and RDF is more to computers’ tastes. The collaboration between Utopia and the Annotation Framework developed by the Harvard group shows, for my money, the most delightful way to date of combining Harvard’s semantically solid, provenance-focused Annotation Ontology with an awesome tool – allowing you to add annotations in the comfort of your own local copy and then share them with the community at large. The whole system makes such sense and seems so easy to use that I don’t see how we could have all lived without it – I’d certainly love a copy on my desktop as soon as possible. Now there’s an iPad application that would make science easier – getting an intelligently annotated document with links to other documents, and being able to communicate with your co-annotators at the moment you are reading what they commented!
  5. Open vs. closed.
    My colleague Brad Allen made the astute observation that the single piece of software that dominated this conference was Twitter – yet nobody talked about it. In fact, a large and very passionate part of the discussion on the first two days of the conference concerned the plea that ‘everything we do has to be open’ – that no matter what is done in the future pertaining to new forms of science publishing, not only does the content have to be freely available, but all software that is used also has to be entirely open source, and thus ‘clonable’ (in the words of a Twitterer). Still, we all happily use Twitter, and it has considerably improved my work – I know of things I wouldn’t otherwise have known, and meet people I would otherwise not have met. In short: it works. I do not know how Twitter makes money, and frankly, I don’t care, as long as they give me a tool that makes my life more productive and pleasant. What was wonderful about Friday’s discussion, in my view, was that we were able to overcome the open/closed debate that seemed to divide the room in earlier discussions (both in the room and on Twitter). It seems clear that if there are features or entities that truly improve the way we read, write and communicate, we are okay with having to pay for them. As an example, I thought Wingu elements was an intriguing way of building an eLab-workflow tool, and the concept of people building apps that can run within or outside the platform a useful one. The business model has not fully evolved yet – but the community can give a list of needed components (interoperability, open data standards, import and export options, for starters) and perhaps these can help Wingu move to a model that is acceptable to the scientists they cater to, and still allow them to exist, as a company.
    Similarly, the Utopia viewer and some of the more domain-specific authoring, indexing, and annotation tools seem to really enhance the speed and quality of information access, and are ones I would gladly pay for.

    As for content, on Friday Maryann Martone stated emphatically that money needs to be spent “Either putting in content or taking it out”. Michael Kurtz put some figures to these concepts: the information infrastructure for an average active astronomer costs about $20,000 per year. Of this, more than 70% goes into data archiving, and only about $4,000 into all publication costs (reading and publishing). The open/closed access discussion is clearly not settled – but as we move to explore a number of use cases, it seems at least we can start to define what the various components are, and can all go back and try to figure out what our role in this brave new world can be. To be sustainable, an information architecture needs to be put in place that allows scientists to add value but not spend too much of their time or (grant) money in maintaining software or offering customer support, while the publishers and repositories provide services (such as archiving and large-scale, high-quality XML production) that the community as a whole agrees are worth paying someone to do. The fact that the workshop seemed to face, and then collectively overcome, the open/closed dichotomy – an old discussion that has not always been very fruitful in the past – is great progress, indeed!

  6. Central vs. distributed
    Another dichotomy that had proponents on either side of the spectrum concerned the organizational arrangement that best allows us to reach this glorious future we all seek. On the one end of the argument is the ‘thousand flowers bloom’, ‘let’s all build our own thing and see how it connects when it works’ school of thought; on the other, proponents of an infrastructure, an architecture, a framework offering a solid foundation that we can all build on (and other construction site metaphors). It seems obvious, and I think was a clear outcome of the meeting, that we should do both: so over the next few months, some people will be working on developing principles or meeting places (there was talk of a journal, and an (Invisible) College?) whereas others are going home and happily continuing to work on a Really Cool App. And as long as the RCA guys’n’gals are aware of interoperability requirements and standards of Basic Semantic Hygiene, and as long as the frameworks and standards groups don’t lose the forest for the trees, and make sure they are still connected to things people actually need and use, this parallel development should help us all leapfrog over each other on the way to science communication paradise.

    A practical example of the ‘central vs. distributed’ dichotomy was the conference discussion before, during and after BtPDF, which has been taking place on several platforms: the conference website, built on Google Sites; Twitter (which was cached and analyzed in different places); the Etherpad app, which was used to take communal notes during the breakout sessions; and the very active BtPDF (Google Groups) mailing list, which occasionally contains urgent requests to please, please, post everything on the website as well. All in all, a nice model for the distributed, frantic information space that the average scientist (well – the average person!) finds him- or herself living in! At least the collected BeyondthePDF conversations offer a nice little corpus of distributed discussions that perhaps some clever group of computer scientists can mine, extract, combine, and connect to represent the voices, themes and dynamics of this spirited debate.

    But, finally, a wonderful outcome of the conference as a whole, in my mind, was the idea that while we maintain the distributed nature of the discussions and developments, a bunch of the participants will collaborate and very concretely start to work on a single use case: helping Maryann Martone expedite finding a cure for Spinal Muscular Atrophy. Maryann actually has a fighting chance to help find a cure for this horrible disease (‘childhood Lou Gehrig’s disease’, as she described it), provided she doesn’t have to spend months or years first gathering and then processing the vast literature that is related to neuromuscular diseases, and their genetic origins, treatment and all other possible related elements. A number of groups at the meeting (including all publishers, Harvard, ISI, the Leiden Bioinformatics group and others) have agreed to join forces and build a ‘knowledge terrarium’ that will help connect all components of the available content without barriers of business models or technology and, most importantly, help Maryann speed up her research. From this use case, undoubtedly, new standards, definitions, architectures, and thoughts about business models will emerge – but more importantly, there’s a chance some kid might walk again because we all got together and did something.

  7. Haves vs. have-nots
    One of the most interesting post-conference discussions I had was about the need to not just improve scientific communication, but to get more people interested in science in the first place. In the US, a mere 16% of all college students got undergraduate degrees in science or engineering (in 2006, the latest figures available, for some reason), as opposed to 47% in China and 27% in France [7]. If we don’t collectively improve this number, there won’t be anyone who can cure cancer, figure out how the brain works, or find out what dark energy is, by the time we are all retired – and the publication process will be the least of the worries of the scientists that are left.
    On the other hand, the single most poignant image of the entire meeting to me came from Leslie Chan, who said that we know that ‘mosquitoes transfer malaria’; in fact, there is a cure for malaria, and yet 850,000 people, mostly children, die from the disease every year. Having a cure is clearly not enough. In trying to solve these and other pressing issues, there is a vast contingent of scientists in the not-so-lucky parts of the world who cannot access, and certainly cannot get published in, the mainstream journals we are trying to change. Chan’s Bioline system, an open-access and open-source platform for developing-world scientists, is a valiant attempt to right some of these wrongs, but as a community it seems we should spend more of our thoughts, efforts, and resources on giving these other groups of scientists access – and perhaps involving them can help address the dearth of scientists that we will surely face.

Well – those are my seven dimensions. I very much look forward to others adding points, debating some of the more outrageous, arrogant or incorrect claims, and in general continuing this discussion with people at, or regretfully not at, the meeting. As a community, I hope we can start to build this information space of the future as we discuss it, and very much look forward to the time when all of this will come to pass, and can be handed over to the next generation, as a matter of course. So that maybe, some day, one of them might figure out what dark energy actually is…

Anita de Waard

[1] Paul Groth, Andrew Gibson and Jan Velterop, The anatomy of a nanopublication, Information Services and Use, Volume 30, Number 1-2 / 2010, p. 51-56, http://iospress.metapress.com/content/ftkh21q50t521wm2/
[2] Joost Kircz, Modularity: the next form of scientific information presentation?, Journal of Documentation, Vol. 54, No. 2, March 1998, pp. 210-235.
[3] de Waard, A. (2007). A Pragmatic Structure for the Research Article, in: Proceedings ICPW’07: 2nd International Conference on the Pragmatic Web, 22-23 Oct. 2007, Tilburg: NL. (Eds.) Buckingham Shum, S., Lind, M. and Weigand, H. Published in: ACM Digital Library & Open University ePrint 9275. http://elsatglabs.com/labs/anita/papers/ICPW2007_DeWaard.pdf
[4] de Waard, A., Buckingham Shum, S., Carusi, A., Park, J., Samwald, M., and Sándor, Á. (2009). Hypotheses, Evidence and Relationships: The HypER Approach for Representing Scientific Knowledge Claims, Proceedings of the Workshop on Semantic Web Applications in Scientific Discourse (SWASD 2009), co-located with the 8th International Semantic Web Conference (ISWC-2009) – http://elsatglabs.com/labs/anita/papers/Hyper290809.pdf
[5] Tudor Groza, Siegfried Handschuh, Tim Clark, Simon Buckingham Shum, Anita de Waard, A Short Survey of Discourse Representation Models, Proceedings of the Semantic Web Applications in Scientific Discourse Workshop at the 8th International Semantic Web Conference (ISWC 2009), Chantilly, Virginia, USA, 2009. http://elsatglabs.com/labs/anita/papers/SWASD2009_Discourse#251E15.pdf
[6] T. K. Attwood, D. B. Kell, P. Mcdermott, J. Marsh, S. R. Pettifer, and D. Thorne. Utopia Documents: linking scholarly literature with research data. Bioinformatics, 26:i540-i546, Sep 2010
[7] Lisa W. Foderaro, “An Infusion of Science Where the Arts Reign”, New York Times, January 21, 2011 http://www.nytimes.com/2011/01/22/nyregion/22science.html?_r=1&ref=science
