What copyright issues should one be aware of when developing a dictionary application?

There are plenty of sources on the internet from which you can "collect" words. But what precautions should be taken before using them?

Let's say there are plenty of sources where you can get word lists without any licensing strings attached (at least not explicitly). What happens if someone collects such lists [just by googling, not by hacking existing software or pirating] and develops a commercial dictionary application?

Cheers

+3  A: 

Dictionaries are, by default, subject to copyright just like any other work. However, you could use the Wiktionary database, which is available under the GNU Free Documentation License.

Greg Hewgill
+1  A: 
  1. Most stuff on the internet is actually not meant for copying, even when it is reachable via Google.

  2. Many jurisdictions protect databases in their own right, either through copyright or through an equivalent such as the EU's database right.

Conclusion: copying word lists and definitions (and most other stuff) from the internet without respecting the source's licenses is a good way to get sued.

David Schmitt
+3  A: 

If you want a full dictionary, WordNet has a C API and interfaces to various other languages, and it can be used in commercial applications without royalty payments.

Alternatively, the Wordlist project on SourceForge has several word lists available for download. Several (possibly all) of these are in the public domain or require only attribution if you use them. These would be fine for a spell checker, crossword solver, etc., but not for an actual dictionary with definitions.

Dave Webb
+5  A: 

You should probably ask a copyright lawyer, and not a bunch of geeks.

Dominic Eidson
True, but you never know how the geeks can surprise you :)
Prakash
Software developers should have at least a basic understanding of patent and copyright law - enough to answer a basic question. I'm trying to learn about said topics, but it is indeed complex.
Thomas Owens
I voted you up. Maybe edit out the self-deprecation? :-)
Jason Cohen
+1  A: 

It is going to depend on whether you are compiling just a list of words or a list of words and definitions.

A simple list of words, with no definitions, is generally not copyrightable in the US; Feist Publications v. Rural Telephone Service is the court case most often cited in situations like this.

If you are copying the definitions as well, then you are getting into a legal gray area where a lawyer would likely be able to tell you one way or another. You might be able to compile such a list of definitions if you pay very close attention to where the material is coming from; for example, some of the Creative Commons licenses would allow you to do this.

Rob
+1  A: 

This of course varies by jurisdiction and you need to check with a lawyer (I ain't one), yet I recall reading once that word lists fall into the same legal category as maps: the particular presentation of them is what gets copyrighted.

A word list by itself isn't copyrightable; the explanations, pronunciation guides, usage examples and, of course, the editorial format of the book itself are.

So unless you want to write your own explanations for most of the words, you would have to check the legalese or be prepared to negotiate with the IP owners. (You can, of course, copy explanations verbatim from old dictionaries whose copyright term has expired, though how long those terms last also depends on the jurisdiction.)

Joe Pineda