杰瑞科技汇

How Can Python's NNTP newnews Command Fetch New Articles Efficiently?

Fetching new articles from an NNTP server with NEWNEWS is a classic task in Python. The NEWNEWS command returns the message IDs of all articles posted to the given newsgroups since a specific date and time.

Here’s a complete guide, including the theory, code examples, and explanations.

What is NNTP and NEWNEWS?

  • NNTP (Network News Transfer Protocol): The protocol used for Usenet, which is a distributed discussion system. It's like a giant, global, decentralized forum. Servers store articles (posts) in newsgroups (forums).
  • NEWNEWS command: An NNTP command that asks the server: "Show me all the new article IDs in these specific newsgroups since this date and time." It's very useful for tracking changes without having to download all headers.
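To make the command concrete, here is a minimal sketch of the line a client sends for NEWNEWS, following the `NEWNEWS wildmat date time [GMT]` syntax from RFC 3977. The helper name build_newnews_command is hypothetical and used only for illustration; a client library normally builds this line for you.

```python
import datetime


def build_newnews_command(wildmat: str, since: datetime.datetime) -> str:
    """Format a NEWNEWS command line (hypothetical helper, for illustration).

    The timestamp is converted to UTC and the GMT keyword is appended, so the
    server does not have to interpret it in its own local time zone.
    """
    if since.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    utc = since.astimezone(datetime.timezone.utc)
    return f"NEWNEWS {wildmat} {utc:%Y%m%d} {utc:%H%M%S} GMT"


if __name__ == "__main__":
    since = datetime.datetime(2025, 10, 27, 12, 0, 0, tzinfo=datetime.timezone.utc)
    print(build_newnews_command("comp.lang.python", since))
    # prints: NEWNEWS comp.lang.python 20251027 120000 GMT
```

The wildmat argument can name one group, several groups separated by commas, or a pattern such as comp.lang.*.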

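Python's Standard Library: nntplib

Python ships an NNTP client in the standard library as the nntplib module (there is no module named nntp). It handles the connection, sends commands, and parses the server's responses. Note that nntplib was deprecated in Python 3.11 and removed in Python 3.13 (PEP 594), so the examples below need Python 3.12 or earlier, or a separately installed copy of the module.

You also need a server to test against. The examples use news.aioe.org, which at the time of writing was a public server that did not require authentication for reading commands like NEWNEWS. Public servers come and go, and some servers disable NEWNEWS entirely, so substitute any server you have read access to if this one does not respond.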
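Because the standard-library NNTP client (nntplib) was removed in Python 3.13, a script may want to check for it before doing anything else. The following is a small sketch of such a check; the helper name module_available is hypothetical.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    version = f"{sys.version_info.major}.{sys.version_info.minor}"
    if module_available("nntplib"):
        print(f"nntplib is importable on Python {version}")
    else:
        print(f"nntplib is not importable on Python {version}")
```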


Example 1: Basic NEWNEWS Usage

This script will connect to the server, ask for all new articles in the comp.lang.python group since yesterday, and print the article IDs.

import datetime
import nntplib  # in the standard library up to Python 3.12; removed in 3.13


def get_new_news(server_address, newsgroups, since_date):
    """
    Connects to an NNTP server and retrieves new message IDs using the NEWNEWS command.
    Args:
        server_address (str): The address of the NNTP server (e.g., 'news.aioe.org').
        newsgroups (list or str): A list of newsgroup names or a single name.
        since_date (datetime.datetime): Only articles newer than this are listed.
    """
    # NEWNEWS takes a single wildmat pattern; several groups are joined with commas
    if isinstance(newsgroups, str):
        newsgroups = [newsgroups]
    wildmat = ",".join(newsgroups)
    print(f"Connecting to {server_address}...")
    try:
        # The constructor connects; the context manager sends QUIT on exit
        with nntplib.NNTP(server_address) as n:
            # getwelcome() already returns a str, so no decoding is needed
            print(f"Connected. Server greeting: {n.getwelcome()}")
            # newnews() takes a date or datetime object and formats it itself.
            # It returns (response, message_ids): the status line and a list of strings.
            response, article_ids = n.newnews(wildmat, since_date)
            print(f"\nNEWNEWS command response: {response}")
            print(f"Found {len(article_ids)} new articles.")
            if article_ids:
                print("\nFirst 10 new article IDs:")
                for i, article_id in enumerate(article_ids[:10], start=1):
                    print(f"  {i}. {article_id}")
            else:
                print("No new articles found in the specified groups since the given date.")
    except nntplib.NNTPError as e:
        print(f"An NNTP error occurred: {e}")
    except ConnectionRefusedError:
        print(f"Connection refused. Is the server '{server_address}' correct and reachable?")
    except OSError as e:
        # Covers DNS failures, timeouts, and other socket-level problems
        print(f"A network error occurred: {e}")


# --- Main execution ---
if __name__ == "__main__":
    # Use a public NNTP server
    SERVER = 'news.aioe.org'
    # Define the newsgroup(s) to query
    NEWSGROUPS = ['comp.lang.python']
    # Look back 24 hours. nntplib sends the timestamp without the GMT keyword,
    # so the server reads it in its own local time; our clock is an approximation.
    SINCE_DATE = datetime.datetime.now() - datetime.timedelta(days=1)
    get_new_news(SERVER, NEWSGROUPS, SINCE_DATE)

How to Run It:

  1. Save the code as a Python file (e.g., nntp_newnews.py).
  2. Run it from your terminal: python nntp_newnews.py

Expected Output:

Connecting to news.aioe.org...
Connected. Server greeting: 200 aioe.org NNTP service ready, posting ok
NEWNEWS command response: 230 list of new articles follows
Found 42 new articles.
First 10 new article IDs:
  1. <a1b2c3d4$1@news.aioe.org>
  2. <e5f6g7h8$2@news.aioe.org>
  3. <i9j0k1l2$3@news.aioe.org>
  ...
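When NEWNEWS is polled repeatedly, consecutive time windows usually overlap a little, so the same message ID can come back more than once. One way to handle this is to remember the IDs already processed. The sketch below assumes a small JSON state file; the helper names (load_seen, save_seen, filter_unseen) and the file name are hypothetical and not part of any library.

```python
import json
import tempfile
from pathlib import Path


def load_seen(path):
    """Load the set of already-processed message IDs (empty if no file yet)."""
    p = Path(path)
    if not p.exists():
        return set()
    return set(json.loads(p.read_text(encoding="utf-8")))


def save_seen(path, seen):
    """Persist the set of processed message IDs as a sorted JSON list."""
    Path(path).write_text(json.dumps(sorted(seen)), encoding="utf-8")


def filter_unseen(article_ids, seen):
    """Return the IDs not yet in `seen`, in order, and record them in `seen`."""
    fresh = []
    for article_id in article_ids:
        if article_id not in seen:
            seen.add(article_id)
            fresh.append(article_id)
    return fresh


if __name__ == "__main__":
    # Simulate two polling rounds with made-up message IDs
    with tempfile.TemporaryDirectory() as tmp:
        state_file = Path(tmp) / "seen_ids.json"
        seen = load_seen(state_file)
        print(filter_unseen(["<a@example.invalid>", "<b@example.invalid>"], seen))
        save_seen(state_file, seen)
        seen = load_seen(state_file)
        print(filter_unseen(["<b@example.invalid>", "<c@example.invalid>"], seen))
```

In a real poller, the list passed to filter_unseen would be the message IDs returned by the NEWNEWS call, and the state file would live somewhere permanent.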

Example 2: Getting Article Headers (Going Further)

Just having an article ID isn't very useful. You'll usually want to get the article's header (subject, author, date, etc.) or the full body. You can do this with the n.head() or n.body() methods.

Here's an extended example that prints the subject of the first few new articles.

import datetime
import nntplib


def get_new_news_with_headers(server_address, newsgroups, since_date, limit=5):
    """
    Gets new message IDs and then fetches the headers for the first 'limit' articles.
    """
    if isinstance(newsgroups, str):
        newsgroups = [newsgroups]
    wildmat = ",".join(newsgroups)
    print(f"Connecting to {server_address}...")
    try:
        with nntplib.NNTP(server_address) as n:
            print(f"Getting new news in {newsgroups} since {since_date:%Y%m%d %H%M%S}...")
            # newnews() formats the datetime itself, so no strftime() call is needed
            _, article_ids = n.newnews(wildmat, since_date)
            if not article_ids:
                print("No new articles found.")
                return
            print(f"\nFound {len(article_ids)} new articles. Fetching headers for the first {limit}...\n")
            # Fetch headers for the first 'limit' articles
            for i, article_id in enumerate(article_ids[:limit], start=1):
                try:
                    # n.head() returns (response, info); info.lines is a list of
                    # raw header lines as bytes, e.g. [b'Subject: ...', b'From: ...']
                    _, info = n.head(article_id)
                    # Find the subject line
                    subject = "No Subject"
                    for raw_line in info.lines:
                        line = raw_line.decode("utf-8", errors="replace")
                        if line.lower().startswith("subject:"):
                            # decode_header() expands MIME encoded-words (=?utf-8?...?=)
                            subject = nntplib.decode_header(line.split(":", 1)[1].strip())
                            break
                    print(f"--- Article {i} ---")
                    print(f"ID: {article_id}")
                    print(f"Subject: {subject}")
                    print("-" * 20)
                except nntplib.NNTPError as e:
                    print(f"Could not fetch header for {article_id}: {e}")
    except nntplib.NNTPError as e:
        print(f"An NNTP error occurred: {e}")


# --- Main execution ---
if __name__ == "__main__":
    SERVER = 'news.aioe.org'
    NEWSGROUPS = ['comp.lang.python']
    SINCE_DATE = datetime.datetime.now() - datetime.timedelta(hours=6)  # Last 6 hours
    get_new_news_with_headers(SERVER, NEWSGROUPS, SINCE_DATE, limit=3)

Expected Output:

Connecting to news.aioe.org...
Getting new news in ['comp.lang.python'] since 20251027 140000...
Found 15 new articles. Fetching headers for the first 3...
--- Article 1 ---
ID: <abc123xyz@news.aioe.org>
Subject: Re: Best way to parse a CSV file in Python?
--------------------
--- Article 2 ---
ID: <def456uvw@news.aioe.org>
Subject: Python 3.13: What's new for data science?
--------------------
--- Article 3 ---
ID: <ghi789rst@news.aioe.org>
Subject: Help with threading and GIL
--------------------
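The example above only looks for the Subject line. If you want several fields at once, the header lines can be turned into a dictionary. The sketch below assumes the lines arrive as bytes (as nntplib returns them) and that a line starting with a space or tab continues the previous header; the function name parse_headers is hypothetical.

```python
def parse_headers(raw_lines):
    """Turn raw header lines (bytes) into a dict keyed by lower-cased header name.

    Folded continuation lines are joined to the header they belong to.
    If a header appears more than once, the last occurrence wins.
    """
    headers = {}
    last_name = None
    for raw in raw_lines:
        line = raw.decode("utf-8", errors="replace")
        if line[:1] in (" ", "\t") and last_name is not None:
            # Continuation of the previous header
            headers[last_name] += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line; skip it
        last_name = name.strip().lower()
        headers[last_name] = value.strip()
    return headers


if __name__ == "__main__":
    sample = [
        b"Subject: Re: Best way to parse",
        b"\ta CSV file in Python?",
        b"From: someone@example.invalid",
    ]
    parsed = parse_headers(sample)
    print(parsed["subject"])
    print(parsed["from"])
```

With nntplib, the input would be the lines attribute of the second value returned by n.head(). Values containing MIME encoded-words still need nntplib.decode_header() or email.header to become readable text.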

Handling Authentication

Some NNTP servers (such as your ISP's or a private one) require authentication. nntplib handles this with the AUTHINFO USER and AUTHINFO PASS commands. You can pass the credentials as the user and password arguments when creating the NNTP object.

# Example with authentication
import datetime
import nntplib

SERVER = 'your.nntp.server.com'
USER = 'your_username'
PASSWORD = 'your_password'
try:
    # The nntplib.NNTP constructor accepts user and password keyword arguments
    # and logs in right after connecting
    with nntplib.NNTP(SERVER, user=USER, password=PASSWORD) as n:
        print("Authenticated successfully!")
        # ... rest of your NEWNEWS code here ...
        _, article_ids = n.newnews('alt.test', datetime.datetime(2025, 10, 27))
        print(f"Found {len(article_ids)} new articles in alt.test.")
except (nntplib.NNTPTemporaryError, nntplib.NNTPPermanentError) as e:
    # A rejected login usually surfaces as a 4xx or 5xx reply (for example 481 or 502)
    print(f"Server rejected the login or command: {e}")
except nntplib.NNTPError as e:
    print(f"An NNTP error occurred: {e}")
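Hard-coding a password in the script is fine for a quick test but awkward for anything you share or commit. A common alternative is to read the credentials from environment variables. The sketch below assumes two variable names, NNTP_USER and NNTP_PASSWORD, which are chosen here purely for illustration.

```python
import os


def load_credentials(environ=None):
    """Read NNTP credentials from environment variables (names are assumptions)."""
    environ = os.environ if environ is None else environ
    user = environ.get("NNTP_USER")
    password = environ.get("NNTP_PASSWORD")
    if not user or not password:
        raise RuntimeError("Set NNTP_USER and NNTP_PASSWORD before running")
    return user, password


if __name__ == "__main__":
    # A plain dict stands in for os.environ so the demo runs anywhere
    user, password = load_credentials({"NNTP_USER": "demo", "NNTP_PASSWORD": "secret"})
    print(f"Loaded credentials for {user}")
```

The returned pair can then be passed as the user and password arguments of the NNTP constructor.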

Summary of Key nntplib Methods

| Method | Description | Example |
| nntplib.NNTP(host, port=119, user=None, password=None) | Constructor. Connects to the server and optionally authenticates. | with nntplib.NNTP('news.aioe.org') as n: |
| n.newnews(group, date) | Core command. Returns (response, message_ids) for articles newer than date, which must be a datetime.date or datetime.datetime. | _, ids = n.newnews('comp.lang.python', datetime.datetime(2025, 10, 27, 12, 0)) |
| n.head(message_spec) | Gets the headers of a specific article. The header lines are bytes in info.lines. | _, info = n.head('<article-id>') |
| n.body(message_spec) | Gets the body of a specific article. The body lines are bytes in info.lines. | _, info = n.body('<article-id>') |
| n.group(name) | Selects a newsgroup and returns its article count and first and last article numbers. | resp, count, first, last, name = n.group('comp.lang.python') |
| n.quit() | Closes the connection gracefully. | Called automatically by the with statement. |