python stanford ner

Using Stanford's Named Entity Recognizer (NER) from Python is a common task. The process has evolved over time: the Stanford CoreNLP server is the recommended modern approach for production systems, while the older standalone Stanford NER tagger is simpler for basic scripts.


I'll guide you through both methods, starting with the recommended one.

Method 1: The Modern & Recommended Way - Stanford CoreNLP Server

This method involves running the Stanford CoreNLP server as a background process and then sending text to it via a simple HTTP request. This is ideal for production applications because it's fast, scalable, and decouples the Java processing from your Python code.

Step 1: Download and Run the CoreNLP Server

Download the CoreNLP package: Go to the Stanford CoreNLP download page and download the latest version. You'll get a .zip file.
Unzip the file: Unzip the downloaded file to a convenient location, for example, C:\stanford-corenlp-4.5.5 (Windows) or /home/user/stanford-corenlp-4.5.5 (Linux/macOS).
Start the server: The package already includes the English models, so no separate download is needed for English (models for other languages are distributed as separate JAR files). Run this command from your terminal inside the unzipped directory:
```
# Navigate to the CoreNLP directory
cd /path/to/stanford-corenlp-4.5.5
# Start the server on port 9000 with a 4 GB heap and a 15-second request timeout
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
```
The -mx4g part sets the maximum Java heap size to 4 GB. Adjust this if you run into memory issues.
Verify the server is running: Open your web browser and go to http://localhost:9000. You should see a page with a text box and some information about the server. If you see this, the server is running correctly.
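The browser check can also be scripted, which is handy before a batch job. The following is a minimal sketch using only the standard library; the function name server_is_up and the 2-second timeout are illustrative choices, not part of CoreNLP.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://localhost:9000/", timeout=2.0):
    """Return True if an HTTP server answers at `url` with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status
        return False


if not server_is_up():
    print("CoreNLP server is not reachable on http://localhost:9000/")
```

A 200 response only shows that something is listening on the port; the first real annotation request may still be slow while the server loads its models.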

Step 2: Python Code to Query the Server

Now, you can write a simple Python script to send text to the server and get the NER results. We'll use the popular requests library.


First, install it if you haven't already:

pip install requests

Now, here's the Python script:

import json
import requests
# The URL of the running Stanford CoreNLP server
url = "http://localhost:9000/"
# Annotators and output format go into the "properties" query parameter as JSON.
# pos and lemma are listed explicitly because the ner annotator depends on them.
properties = {"annotators": "tokenize,ssplit,pos,lemma,ner", "outputFormat": "json"}
# The text you want to analyze
text_to_analyze = "Barack Obama was born in Hawaii. He was the 44th President of the United States."
# Send the POST request
# The server expects the raw text in the request body; encode it as UTF-8
# so that non-ASCII characters are transmitted correctly
response = requests.post(
    url,
    params={"properties": json.dumps(properties)},
    data=text_to_analyze.encode("utf-8"),
)
# Check if the request was successful
if response.status_code == 200:
    # The response is in JSON format
    result = response.json()
    # The result is a dict that holds a list of sentences
    for sentence in result['sentences']:
        # Each sentence has a list of tokens (words)
        for token in sentence['tokens']:
            # Each token has the original text, its character offset, and NER tag
            original_text = token['originalText']
            ner_tag = token['ner']
            print(f"Word: {original_text:<15} NER Tag: {ner_tag}")
else:
    print(f"Error: {response.status_code}")
    print(response.text)

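Example output (exact tags depend on the CoreNLP version and NER settings; recent versions may assign finer-grained tags such as COUNTRY or TITLE):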

Word: Barack          NER Tag: PERSON
Word: Obama           NER Tag: PERSON
Word: was             NER Tag: O
Word: born            NER Tag: O
Word: in              NER Tag: O
Word: Hawaii          NER Tag: LOCATION
Word: .               NER Tag: O
Word: He              NER Tag: O
Word: was             NER Tag: O
Word: the             NER Tag: O
Word: 44th            NER Tag: O
Word: President       NER Tag: O
Word: of              NER Tag: O
Word: the             NER Tag: O
Word: United          NER Tag: LOCATION
Word: States          NER Tag: LOCATION
Word: .               NER Tag: O
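Per-token tags are often more useful once adjacent tokens are merged into whole entities. The sketch below groups consecutive tokens that share the same non-O tag; group_entities is a hypothetical helper name, and the input is a list of (word, tag) pairs like the ones printed above.

```python
def group_entities(tagged_tokens, outside_tag="O"):
    """Merge consecutive tokens that share a non-"O" tag into (text, tag) spans."""
    entities = []
    current_words, current_tag = [], None
    for word, tag in tagged_tokens:
        if tag == current_tag and tag != outside_tag:
            # Same entity continues
            current_words.append(word)
            continue
        if current_words:
            # The tag changed, so close the entity collected so far
            entities.append((" ".join(current_words), current_tag))
        if tag == outside_tag:
            current_words, current_tag = [], None
        else:
            current_words, current_tag = [word], tag
    if current_words:
        entities.append((" ".join(current_words), current_tag))
    return entities


tagged = [
    ("Barack", "PERSON"), ("Obama", "PERSON"), ("was", "O"), ("born", "O"),
    ("in", "O"), ("Hawaii", "LOCATION"), (".", "O"),
]
print(group_entities(tagged))
# [('Barack Obama', 'PERSON'), ('Hawaii', 'LOCATION')]
```

Because these tags carry no begin/inside markers, two different entities of the same type that sit directly next to each other are merged into one span.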

Method 2: The Classic Way - Standalone Stanford NER Tagger

This method uses NLTK's StanfordNERTagger class, which acts as a wrapper for the older, standalone stanford-ner.jar file. It's simpler for a one-off script but less flexible for production.

Step 1: Download the Stanford NER Package

Download the NER package: Go to the Stanford NER download page and download the "Full Stanford NER distribution".
Unzip the file. You will find a stanford-ner.jar file and a folder named classifiers, which contains the pre-trained model files (e.g., english.muc.7class.distsim.crf.ser.gz).

Step 2: Set Up Your Python Environment

You need two things:

The nltk Python library.
A Java runtime (JRE or JDK) installed and available on your PATH, as the library runs the Java JAR file.

Install the Python library:

pip install nltk

Step 3: Python Code to Use the Tagger

You need to tell the StanfordNERTagger class where to find the model file you want to use and the stanford-ner.jar file.

import re
from nltk.tag import StanfordNERTagger
# Path to the stanford-ner.jar file
# IMPORTANT: Change this path to where you unzipped the file
jar_path = '/path/to/stanford-ner.jar'
# Path to the pre-trained model file
# IMPORTANT: Change this path to where you unzipped the file
model_path = '/path/to/classifiers/english.muc.7class.distsim.crf.ser.gz'
# Initialize the tagger (model file first, then the JAR)
tagger = StanfordNERTagger(model_path, jar_path, encoding='utf-8')
text_to_analyze = "Barack Obama was born in Hawaii. He was the 44th President of the United States."
# The tagger expects a list of tokens, not a raw string.
# This simple regex splits words and punctuation; swap in a real tokenizer if needed.
tokens = re.findall(r"\w+|[^\w\s]", text_to_analyze)
# tag() runs the Java process and returns a list of (token, tag) tuples
# This can take a few seconds because the JVM starts and loads the model
ner_tags = tagger.tag(tokens)
# Print the results
for token, tag in ner_tags:
    print(f"Word: {token:<15} NER Tag: {tag}")
# No explicit cleanup is needed: the Java process exits after each tag() call

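Example output (illustrative; tag names and results depend on the model file, so they can differ from the CoreNLP server. The short labels shown here, such as I-PER, come from models trained with that scheme, whereas the English models shipped with Stanford NER typically emit PERSON, LOCATION, and ORGANIZATION.)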

Word: Barack          NER Tag: I-PER
Word: Obama           NER Tag: I-PER
Word: was             NER Tag: O
Word: born            NER Tag: O
Word: in              NER Tag: O
Word: Hawaii          NER Tag: I-LOC
Word: .               NER Tag: O
Word: He              NER Tag: O
Word: was             NER Tag: O
Word: the             NER Tag: O
Word: 44th            NER Tag: O
Word: President       NER Tag: O
Word: of              NER Tag: O
Word: the             NER Tag: O
Word: United          NER Tag: I-ORG
Word: States          NER Tag: I-ORG
Word: .               NER Tag: O
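Since label names differ between models (I-PER here, PERSON in Method 1), it can help to normalize them before comparing results. This is a small sketch; the SHORT_TO_LONG mapping and the normalize_tag name are assumptions made for illustration and should be adjusted to the label set of the model actually in use.

```python
# Hypothetical mapping from short label names to the long names seen in Method 1
SHORT_TO_LONG = {"PER": "PERSON", "LOC": "LOCATION", "ORG": "ORGANIZATION"}


def normalize_tag(tag):
    """Strip a BIO-style prefix ("B-" or "I-") and expand short label names."""
    if tag[:2] in ("B-", "I-"):
        tag = tag[2:]
    # Labels without a mapping (including "O") are returned unchanged
    return SHORT_TO_LONG.get(tag, tag)


for raw in ["I-PER", "I-LOC", "I-ORG", "PERSON", "O"]:
    print(f"{raw:<8} -> {normalize_tag(raw)}")
```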

Comparison and Recommendation

Feature	Stanford CoreNLP Server (Method 1)	Standalone Stanford NER Tagger (Method 2)
How it Works	Client-server model. Python sends HTTP requests.	Python library directly calls a Java JAR file.
Scalability	Excellent. Multiple Python scripts can connect to one server.	Poor. Each script starts its own Java process, which is resource-intensive.
Performance	Very Fast. The server is optimized for repeated requests.	Slower. Starting the JVM for each script is costly.
Ease of Use	Simple once the server is running. The code is very clean.	Simple for a single script, but requires managing paths to JAR and model files.
Flexibility	High. Can perform many NLP tasks (POS tagging, parsing, etc.) by changing the URL parameters.	Low. Primarily for NER. Other tasks require different wrappers.
Best For	Production applications, web services, and any repeated use.	Quick experiments, learning, or simple, one-off scripts.
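To make the flexibility row concrete: switching tasks on the server only means changing the annotators listed in the properties query parameter. The sketch below builds such a URL with the standard library; corenlp_url is a hypothetical helper name, and the base URL assumes the server from Method 1 on port 9000.

```python
import json
from urllib.parse import quote, urlencode


def corenlp_url(annotators, base_url="http://localhost:9000/"):
    """Build a server URL whose "properties" parameter requests the given annotators."""
    properties = {"annotators": ",".join(annotators), "outputFormat": "json"}
    # quote_via=quote encodes spaces as %20 rather than "+"
    query = urlencode({"properties": json.dumps(properties)}, quote_via=quote)
    return base_url + "?" + query


# Part-of-speech tagging instead of NER: only the annotator list changes
print(corenlp_url(["tokenize", "ssplit", "pos"]))
```

The returned URL can be passed to requests.post in place of the NER URL; the JSON response keeps the same sentences/tokens layout, with different per-token fields.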

Final Recommendation:

For any serious project, use Method 1 (Stanford CoreNLP Server). The setup is a one-time effort, and the benefits in terms of performance and scalability are immense. Use Method 2 only if you are in a hurry for a quick test and don't want to bother with running a server.
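For repeated use, one way to get the most out of Method 1 is to send many texts over a single reused HTTP session. The following is a rough sketch, not a definitive client: tag_texts is a hypothetical helper, the 30-second timeout is an arbitrary choice, and the session argument is expected to be a requests.Session (or any object with a compatible post method).

```python
import json


def tag_texts(texts, session, url="http://localhost:9000/"):
    """Send each text to a running CoreNLP server and return one (word, tag) list per text."""
    params = {"properties": json.dumps(
        {"annotators": "tokenize,ssplit,pos,lemma,ner", "outputFormat": "json"}
    )}
    results = []
    for text in texts:
        response = session.post(url, params=params, data=text.encode("utf-8"), timeout=30)
        response.raise_for_status()  # Fail loudly on HTTP errors
        results.append([
            (token["originalText"], token["ner"])
            for sentence in response.json()["sentences"]
            for token in sentence["tokens"]
        ])
    return results


# Usage, assuming the server from Method 1 is running:
#   import requests
#   with requests.Session() as session:
#       print(tag_texts(["Barack Obama was born in Hawaii."], session))
```

Passing the session in keeps the connection alive across requests and makes the function easy to exercise with a stub when no server is available.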
