NWMatcher - Documentation

Introduction

What is NWMatcher?

NWMatcher is a pattern matching library for network data. It provides a concise syntax for defining matching rules against network elements such as IP addresses, ports, and protocols. It is built for speed and scalability, which makes it suitable for high-volume network monitoring, analysis, and security applications. Developers can express sophisticated matching logic without hand-writing intricate regular expressions or chains of conditional statements.

Key Features and Benefits

Target Audience

NWMatcher is primarily aimed at network engineers, security professionals, and software developers working on network-related projects. Its ease of use and powerful features make it beneficial for both experienced developers and those new to network data processing. Anyone needing to perform efficient and accurate pattern matching on network data will find NWMatcher valuable.

Installation and Setup

The installation process for NWMatcher depends on your chosen environment and package manager. Below are instructions for common scenarios:

Using pip (Python):

  1. Open your terminal or command prompt.
  2. Execute the following command: pip install nwmatcher

From Source:

  1. Clone the NWMatcher repository from [GitHub repository link].
  2. Navigate to the project directory in your terminal.
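  3. Run pip install . from the project directory. On older setups, python setup.py install also works, although setuptools has deprecated that command (you may need to adjust this based on your Python environment).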

Note: Ensure you have the necessary dependencies installed. The README file in the repository provides a complete list of dependencies and detailed instructions for various installation methods. After successful installation, you can import and use the NWMatcher library in your Python code. Examples are provided in the “Examples” section of this manual.

Core Concepts

Matching Algorithms

NWMatcher employs a hybrid approach to pattern matching, combining optimized Trie structures with advanced filtering techniques for high-performance matching. The core algorithm works as follows:

  1. Trie Construction: Upon initialization with a set of patterns, NWMatcher builds a Trie data structure. This efficiently indexes the patterns, allowing for rapid prefix matching.

  2. Filtering: When a network data element (e.g., an IP address or port number) is provided for matching, NWMatcher first applies inexpensive checks based on basic characteristics (e.g., IP address range checks). This significantly reduces the number of patterns that need to be evaluated using the Trie.

  3. Trie Traversal: The input is then matched by traversing the Trie, considering only the patterns that survived filtering. Because the Trie is organized by shared prefixes, only the relevant branches are explored, which further reduces the work per lookup.

  4. Result Aggregation: Matching results are aggregated and returned to the user. This may include a list of matching patterns, scores (if weighting is applied), or other relevant metadata.
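The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not NWMatcher's actual implementation: the names TrieNode, build_trie, looks_like_ipv4, and find_matches are hypothetical, patterns are assumed to be dotted IPv4 strings in which an octet may be the wildcard *, and the filtering step is reduced to a simple shape check on the input.

```python
from typing import Dict, List, Optional


class TrieNode:
    """One node per octet token; `pattern` is set where a pattern ends."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.pattern: Optional[str] = None


def build_trie(patterns: List[str]) -> TrieNode:
    """Step 1: index dotted patterns octet by octet."""
    root = TrieNode()
    for pattern in patterns:
        node = root
        for token in pattern.split("."):
            node = node.children.setdefault(token, TrieNode())
        node.pattern = pattern
    return root


def looks_like_ipv4(text: str) -> bool:
    """Step 2 (simplified): cheap check that rejects input before traversal."""
    parts = text.split(".")
    return len(parts) == 4 and all(
        p.isascii() and p.isdigit() and int(p) <= 255 for p in parts
    )


def find_matches(root: TrieNode, text: str) -> List[str]:
    """Steps 3 and 4: walk only the relevant branches, then collect results."""
    if not looks_like_ipv4(text):
        return []
    frontier = [root]
    for token in text.split("."):
        next_frontier = []
        for node in frontier:
            # Follow the exact-octet branch and the wildcard branch, if present.
            for key in (token, "*"):
                child = node.children.get(key)
                if child is not None:
                    next_frontier.append(child)
        frontier = next_frontier
    return sorted(n.pattern for n in frontier if n.pattern is not None)


trie = build_trie(["192.168.1.*", "10.0.0.1", "192.168.*.*"])
print(find_matches(trie, "192.168.1.100"))  # ['192.168.*.*', '192.168.1.*']
```

The frontier never holds more than the nodes reachable for the octets seen so far, which is the sense in which only relevant branches are explored.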

Data Structures

NWMatcher utilizes several key data structures to achieve its efficiency and flexibility:

These data structures are optimized for memory usage and performance in high-throughput scenarios.

Pattern Syntax

NWMatcher uses a flexible and expressive pattern syntax. Patterns are defined as strings that can incorporate wildcard characters and logical operators.

Example:

192.168.1.* && (TCP || UDP) matches any IP address in the 192.168.1.x range using either TCP or UDP.
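To make the semantics of such a pattern concrete, the following sketch evaluates it against an (IP address, protocol) pair. It is an illustration under stated assumptions, not NWMatcher's parser: the function names are hypothetical, && is assumed to bind tighter than ||, terms containing a dot or * are treated as address globs (via the standard library's fnmatch), and any other term is treated as a protocol name compared case-insensitively.

```python
import re
from fnmatch import fnmatchcase
from typing import List

# Operators, parentheses, or a run of other non-space characters (a term).
TOKEN_RE = re.compile(r"\s*(&&|\|\||\(|\)|[^\s()&|]+)")


def tokenize(pattern: str) -> List[str]:
    tokens, pos = [], 0
    pattern = pattern.strip()
    while pos < len(pattern):
        m = TOKEN_RE.match(pattern, pos)
        if m is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def evaluate(pattern: str, ip: str, protocol: str) -> bool:
    """Evaluate a pattern expression (hypothetical semantics, see lead-in)."""
    tokens = tokenize(pattern)
    pos = 0

    def term_matches(term: str) -> bool:
        if "." in term or "*" in term:
            return fnmatchcase(ip, term)  # address glob
        return term.upper() == protocol.upper()  # protocol name

    def parse_atom() -> bool:
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of pattern")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token in ("&&", "||", ")"):
            raise ValueError(f"unexpected token {token!r}")
        return term_matches(token)

    def parse_and() -> bool:
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == "&&":
            pos += 1
            right = parse_atom()  # always parse, then combine
            value = value and right
        return value

    def parse_or() -> bool:
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "||":
            pos += 1
            right = parse_and()
            value = value or right
        return value

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result


print(evaluate("192.168.1.* && (TCP || UDP)", "192.168.1.7", "udp"))  # True
```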

Weighting and Scoring

NWMatcher supports assigning weights to patterns, allowing for prioritized matching and scoring. Each pattern can be assigned a numerical weight that reflects its relative importance. During matching, the weights of all matching patterns are summed to produce an overall score, so the user can judge how significant a match is. The scoring mechanism can be customized; for example, a higher weight could be assigned to security-related patterns.
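The summation described above can be illustrated with a short sketch. The function total_score is hypothetical and is not part of NWMatcher; it assumes simple glob patterns and uses the standard library's fnmatch in place of the library's own matching.

```python
from fnmatch import fnmatchcase
from typing import Dict


def total_score(text: str, weights: Dict[str, float]) -> float:
    """Sum the weights of every pattern that matches `text`."""
    return sum(
        (weight for pattern, weight in weights.items() if fnmatchcase(text, pattern)),
        0.0,
    )


weights = {"192.168.1.*": 2.0, "192.168.*": 1.5, "10.0.0.1": 1.0}
print(total_score("192.168.1.100", weights))  # 3.5 (two patterns match)
print(total_score("172.16.0.1", weights))     # 0.0 (no pattern matches)
```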

API Reference

Constructor

NWMatcher(patterns=None, default_options=None)

Initializes a new NWMatcher instance.

match(text, pattern)

match(text, pattern) -> bool

Checks if the given text matches the provided pattern.

score(text, pattern)

score(text, pattern) -> float

Returns the score of matching text against the given pattern. This assumes that weights have been set using setWeights(). A score of 0.0 indicates no match.

getMatches()

getMatches() -> list

Returns a list of tuples, where each tuple contains a matched pattern and its associated score (if weighting is enabled). This method returns all matches found since the last call to getMatches() or since the instantiation of NWMatcher.
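The "since the last call" behavior means that reading the matches also clears them. The sketch below shows that behavior with a hypothetical stand-in class, MatchLog; it is not the NWMatcher class, and its glob matching (fnmatch) and weight parameter are assumptions made for illustration.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple


class MatchLog:
    """Hypothetical stand-in that records matches until they are read."""

    def __init__(self) -> None:
        self._pending: List[Tuple[str, float]] = []

    def match(self, text: str, pattern: str, weight: float = 1.0) -> bool:
        matched = fnmatchcase(text, pattern)
        if matched:
            self._pending.append((pattern, weight))
        return matched

    def get_matches(self) -> List[Tuple[str, float]]:
        # Hand back everything recorded so far and start a fresh list.
        drained, self._pending = self._pending, []
        return drained


log = MatchLog()
log.match("10.0.0.1", "10.0.0.*")
log.match("10.0.0.1", "192.168.*")
first = log.get_matches()   # [('10.0.0.*', 1.0)]
second = log.get_matches()  # [] because the first call drained the log
```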

setWeights(weights)

setWeights(weights) -> None

Sets the weights for the patterns.

setDefaultOptions(options)

setDefaultOptions(options) -> None

Sets default options for the matcher.

Events

Depending on the implementation, NWMatcher may support custom events that notify the user of certain actions, such as a match being found or a pattern being added. Refer to the documentation of your implementation for the available events and how to subscribe to them. Example: onMatchFound.
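Because the event mechanism is implementation-specific, the following is only a generic sketch of how a subscription model for an event such as onMatchFound could look. The EventHub class and its subscribe and emit methods are hypothetical and are not NWMatcher APIs.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class EventHub:
    """Hypothetical publish/subscribe helper keyed by event name."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[..., None]]] = defaultdict(list)

    def subscribe(self, event: str, handler: Callable[..., None]) -> None:
        self._handlers[event].append(handler)

    def emit(self, event: str, **payload) -> None:
        # Copy the list so handlers may subscribe others while being called.
        for handler in list(self._handlers.get(event, [])):
            handler(**payload)


seen = []
hub = EventHub()
hub.subscribe("onMatchFound", lambda text, pattern: seen.append((text, pattern)))
hub.emit("onMatchFound", text="10.0.0.1", pattern="10.0.0.*")
hub.emit("onPatternAdded", pattern="10.0.0.*")  # no subscribers, so nothing happens
```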

Advanced Usage

Customizing Matching Algorithms

NWMatcher’s default matching algorithm is optimized for the common case, but advanced users may need to customize it, for example by modifying the Trie structure, implementing alternative filtering strategies, or adding support for new data types. Doing so typically means extending the core NWMatcher class or using its internal components, which may be exposed through protected methods or the library’s extensibility features. The details depend on the library’s internal architecture and are covered in the advanced developer documentation or API specification. Consider contributing your custom algorithm back to the project if it has broader applicability.

Handling Complex Patterns

NWMatcher is designed to handle complex patterns efficiently, but the performance and clarity can be impacted by overly convoluted or inefficiently structured patterns. For best results:

Performance Optimization

For optimal performance, consider these strategies:

Integration with other libraries

NWMatcher can be readily integrated with other network analysis or security libraries. Examples include:

Remember to consult the documentation of other libraries to understand their APIs and best practices for integration with NWMatcher. Ensure compatibility of data formats and interfaces between the libraries.

Examples

Basic Matching

This example demonstrates basic pattern matching using NWMatcher:

from nwmatcher import NWMatcher

matcher = NWMatcher()
matcher.add_pattern("192.168.1.*")
matcher.add_pattern("10.0.0.1")

print(matcher.match("192.168.1.100", "192.168.1.*"))  # Output: True
print(matcher.match("10.0.0.1", "10.0.0.1"))          # Output: True
print(matcher.match("172.16.0.1", "192.168.1.*"))     # Output: False

matches = matcher.getMatches()
print(matches)  # Output: a list of matched patterns (implementation-specific format)

Fuzzy Matching

Note: Fuzzy matching is not part of the core API described in the API Reference; it typically requires a separate library or algorithm.

The example below assumes that NWMatcher has been extended with, or integrated with, a fuzzy matching library:

from nwmatcher import NWMatcher # Assume NWMatcher supports fuzzy matching

matcher = NWMatcher(fuzzy_matching=True) # Enabling fuzzy matching (implementation specific)
matcher.add_pattern("example.com")

print(matcher.match("example.com", "example.com"))  # Output: True
print(matcher.match("exmaple.com", "example.com"))  # Output: Possibly True (depending on fuzzy matching tolerance)
print(matcher.match("example.co", "example.com"))   # Output: Possibly True (depending on fuzzy matching tolerance)
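What "fuzzy matching tolerance" can mean is easier to see with a concrete similarity measure. The sketch below uses difflib.SequenceMatcher from the Python standard library as a stand-in; the function fuzzy_match and its tolerance parameter are hypothetical, and nothing here implies that NWMatcher uses this measure.

```python
from difflib import SequenceMatcher


def fuzzy_match(text: str, pattern: str, tolerance: float = 0.8) -> bool:
    """Return True when the similarity ratio reaches `tolerance` (0.0 to 1.0)."""
    return SequenceMatcher(None, text, pattern).ratio() >= tolerance


print(fuzzy_match("example.com", "example.com"))  # True (identical strings)
print(fuzzy_match("exmaple.com", "example.com"))  # True at the default tolerance
print(fuzzy_match("google.org", "example.com"))   # False
```

Raising the tolerance toward 1.0 makes the check stricter; at exactly 1.0 only identical strings pass.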

Regex Matching

Note: This example shows how to use Python’s re module for regular expression matching alongside NWMatcher, assuming that NWMatcher does not handle regular expressions natively:

import re
from nwmatcher import NWMatcher

matcher = NWMatcher()

# Loose IPv4 shape check: four dot-separated groups of 1-3 digits.
# It does not reject out-of-range octets such as 999.
pattern = r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$"

ip_address = "192.168.1.100"

if re.match(pattern, ip_address):
    # Only well-formed input is handed to NWMatcher.
    print(matcher.match(ip_address, "192.168.1.*"))  # Output: True
else:
    print(f"{ip_address} is not a well-formed IPv4 address")
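The regular expression above only checks the shape of the input, so it also accepts strings such as 999.999.999.999. Where strict validation matters, the standard library's ipaddress module is one option. The helper is_valid_ipv4 below is a hypothetical name used for illustration and is independent of NWMatcher.

```python
import ipaddress
import re

# Same shape-only regular expression as in the example above.
SHAPE_RE = re.compile(r"^\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}$")


def is_valid_ipv4(text: str) -> bool:
    """Return True only for a syntactically valid IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
    except ipaddress.AddressValueError:
        return False
    return True


candidate = "999.999.999.999"
print(bool(SHAPE_RE.match(candidate)))  # True: the shape check passes
print(is_valid_ipv4(candidate))         # False: octets exceed 255
```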

Weighted Matching

This example showcases how to assign weights to patterns and retrieve weighted scores:

from nwmatcher import NWMatcher

matcher = NWMatcher()
matcher.setWeights({"192.168.1.*": 2.0, "10.0.0.1": 1.0})
matcher.add_pattern("192.168.1.*")
matcher.add_pattern("10.0.0.1")

print(matcher.score("192.168.1.100", "192.168.1.*"))  # Output: 2.0
print(matcher.score("10.0.0.1", "10.0.0.1"))          # Output: 1.0
print(matcher.score("172.16.0.1", "192.168.1.*"))     # Output: 0.0

matches = matcher.getMatches()
print(matches)  # Output: a list of (pattern, score) tuples

Practical Applications


Troubleshooting

Common Errors

Debugging Techniques

Frequently Asked Questions (FAQ)


Contributing

We welcome contributions to NWMatcher! Whether you’re fixing bugs, adding features, or improving documentation, your help is valuable. Please follow these guidelines to ensure a smooth and efficient contribution process.

Code Style Guide

We adhere to the PEP 8 style guide for Python code. Consistency in code style is crucial for readability and maintainability. Before submitting any code changes, ensure that your code conforms to the PEP 8 guidelines. You can use flake8 to check for style issues and autopep8 to fix many of them automatically.

Testing

Thorough testing is essential to ensure the quality and stability of NWMatcher. All new features and bug fixes should be accompanied by comprehensive unit tests. We use [Testing Framework Name - e.g., pytest] for testing.

Submitting Pull Requests

  1. Fork the Repository: Create a fork of the NWMatcher repository on GitHub.
  2. Create a Branch: Create a new branch for your changes. Use descriptive branch names (e.g., feature/add-fuzzy-matching, bugfix/resolve-pattern-error).
  3. Make Your Changes: Make your code changes, following the code style guide and adding comprehensive tests.
  4. Commit Your Changes: Commit your changes with clear and concise commit messages.
  5. Push Your Branch: Push your branch to your forked repository.
  6. Create a Pull Request: Create a pull request from your branch to the main branch of the original NWMatcher repository.
  7. Address Feedback: Address any feedback or suggestions from the maintainers. Be prepared to make further changes based on code reviews.

Before submitting a pull request, ensure that:

We appreciate your contributions and look forward to working with you to improve NWMatcher!