You can either mint your own URI as discussed above, or use a blank-node. There are pros and cons for both approaches:
URIs have an external identity, so you can refer to your concept explicitly in future queries, which can make some queries much simpler. But because they have an external identity, the algorithm you use to construct the URIs becomes a critical part of your infrastructure, and you have to guarantee that they are both stable and unique. This may be trivial at first, but once you are dealing with multiple documents being reprocessed at different times, often in parallel and on distributed systems, it quickly ceases to be straightforward.
Blank-nodes were included specifically to solve this problem: their uniqueness is guaranteed by their scoping. However, if you need to refer to a blank-node explicitly in a query, you will need either a non-standard extension or some way to characterize the node.
In both cases, but especially if you use a blank-node, you should include provenance statements to characterize it anyway.
@nathan's example is a good one to get the idea.
So an example using blank-nodes might be:
@prefix my: <http://yourdomain.com/2010/07/20/conceptmap#> .
@prefix proc: <http://yourdomain.com/2010/07/20/processing#> .
@prefix prg: <http://yourdomain.com/processors#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix doc: <http://yourdomain.com/doc-path/> .

# Provenance: one blank node describing this processing run
_:1 rdf:type proc:ProcessRun ;
    proc:parser prg:tagger ;
    proc:version "1.0.2" ;
    proc:time "2010-07-03T20:35:45"^^xsd:dateTime ;
    proc:host prg:hostname-of-processing-node ;
    proc:file <http://yourdomain.com/doc-path/some-doc#line=1,;md5=md5_sum_goes_here,mime-charset_goes_here> .

# Entities found by that run; fragment identifiers need full IRIs, not prefixed names
_:2 rdf:type foaf:Person ;
    foaf:name "John Smith"@en ;
    proc:identifiedBy _:1 ;
    proc:atLocation <http://yourdomain.com/doc-path/some-doc#char=0,9> .

_:3 rdf:type owl:Thing ;
    foaf:name "Washington"@en ;
    proc:identifiedBy _:1 ;
    proc:atLocation <http://yourdomain.com/doc-path/some-doc#char=24,33> .

<http://yourdomain.com/some-doc#this> rdf:type foaf:Document ;
    dcterms:references _:2, _:3 .
Note the use of RFC 5147 text/plain fragment identifiers to uniquely identify the file being processed; this gives you flexibility in how you identify individual runs. The alternative is to capture all this in the URI for the document root, or to abandon provenance altogether.
@prefix : <http://yourdomain.com/ProcessRun/parser=tagger/version=1.0.2/time=2010-07-03+20:35:45/host=hostname-of-processing-node/file=http%3A%2F%2Fyourdomain.com%2Fdoc-path%2Fsome-doc%23line%3D1%2C%3Bmd5%3Dmd5_sum_goes_here%2Cmime-charset_goes_here/> .
@prefix my: <http://yourdomain.com/2010/07/20/conceptmap#> .
@prefix proc: <http://yourdomain.com/2010/07/20/processing#> .
@prefix prg: <http://yourdomain.com/processors#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix doc: <http://yourdomain.com/doc-path/> .

# Same data as before, but every node is a URI minted under the run-specific prefix
:1 rdf:type proc:ProcessRun ;
    proc:parser prg:tagger ;
    proc:version "1.0.2" ;
    proc:time "2010-07-03T20:35:45"^^xsd:dateTime ;
    proc:host prg:hostname-of-processing-node ;
    proc:file <http://yourdomain.com/doc-path/some-doc#line=1,;md5=md5_sum_goes_here,mime-charset_goes_here> .

:2 rdf:type foaf:Person ;
    foaf:name "John Smith"@en ;
    proc:identifiedBy :1 ;
    proc:atLocation <http://yourdomain.com/doc-path/some-doc#char=0,9> .

:3 rdf:type owl:Thing ;
    foaf:name "Washington"@en ;
    proc:identifiedBy :1 ;
    proc:atLocation <http://yourdomain.com/doc-path/some-doc#char=24,33> .

<http://yourdomain.com/some-doc#this> rdf:type foaf:Document ;
    dcterms:references :2, :3 .
You will note that foaf:name has a domain of owl:Thing, so it can be applied to anything. An alternative might be to use skos:Concept and rdfs:label for the proper nouns.
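As a rough sketch of that alternative, the "Washington" node from the blank-node example could be typed as a skos:Concept and labelled with rdfs:label. The proc: properties and the document URI are carried over from the examples above; only the skos: prefix is new here.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix proc: <http://yourdomain.com/2010/07/20/processing#> .

# "Washington" might be a person, a city or a state, so only claim it is a concept
_:3 rdf:type skos:Concept ;
    rdfs:label "Washington"@en ;
    proc:identifiedBy _:1 ;
    proc:atLocation <http://yourdomain.com/doc-path/some-doc#char=24,33> .
```

This avoids committing to a type for the entity before you have unified it with anything.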
One final consideration for blank-node vs. URI is that any datastore you use will ultimately have to store any URI you use, and this can have implications regarding performance if you are using very large datasets.
Ultimately, if I was going to publish the provenance information in the graph along with the final unified entities, I would be inclined to go with blank-nodes and allocate URIs to the concepts I ultimately unify entities with.
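A minimal sketch of what that might look like: the mention stays a blank-node tied to its processing run, and only the unified entity gets a minted URI. The ent: namespace and the proc:unifiedWith property are assumptions made up for this illustration; use whatever your own vocabulary defines.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix proc: <http://yourdomain.com/2010/07/20/processing#> .
# Assumed namespace for unified entities
@prefix ent: <http://yourdomain.com/entities#> .

# The unified entity gets a stable, minted URI
ent:john-smith rdf:type foaf:Person ;
    foaf:name "John Smith"@en .

# The individual mention stays a blank-node carrying the provenance
_:2 proc:identifiedBy _:1 ;
    proc:atLocation <http://yourdomain.com/doc-path/some-doc#char=0,9> ;
    proc:unifiedWith ent:john-smith .   # hypothetical property
```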
If, however, I am not going to be tracking the provenance of the inferences, and this is just one pass of many in a pipeline that will ultimately discard the intermediate results, I would just mint URIs using some sort of document hash, timestamp, and id and be done with it.
@prefix : <http://yourdomain.com/entities#> .
@prefix my: <http://yourdomain.com/2010/07/20/conceptmap#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

:filename_timestamp_1 rdf:type foaf:Person ;
    foaf:name "John Smith"@en .

:filename_timestamp_2 rdf:type owl:Thing ;
    foaf:name "Washington"@en .

# The references must use the minted names declared above
<http://yourdomain.com/some-doc#this> rdf:type foaf:Document ;
    dcterms:references :filename_timestamp_1, :filename_timestamp_2 .