How Ni1 Respects robots.txt

At Ni1, respect for website owners is a core principle. Our crawler follows established web standards and honors the instructions provided through the robots.txt protocol. We believe website owners should have clear control over how their content is accessed and indexed.

What Is robots.txt?

A robots.txt file is a simple text file placed in the root directory of a website. It tells web crawlers which areas of a site they are allowed or not allowed to access.

For example:

User-agent: *
Disallow: /private/

This rule tells all crawlers not to access the /private/ directory.
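
To see how a parser reads this rule, the sketch below feeds it to Python's standard-library urllib.robotparser. The agent name ExampleBot and the URLs are placeholders chosen for illustration, and the snippet is not a description of Ni1's internal implementation.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a placeholder name; the * group applies to any crawler.
private_ok = parser.can_fetch("ExampleBot", "https://example.com/private/report.html")
blog_ok = parser.can_fetch("ExampleBot", "https://example.com/blog/")

print(private_ok)  # False: the path starts with /private/
print(blog_ok)     # True: no rule matches /blog/
```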

A robots.txt file is typically located at:

https://example.com/robots.txt
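
Because the file always sits at the root of the host, its location can be derived from any page URL on the same site. A minimal sketch, where robots_url is a hypothetical helper name:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any port); drop path, query, fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post?id=7"))
# https://example.com/robots.txt
```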

How Ni1 Uses robots.txt

Before crawling a website, Ni1 checks the site’s robots.txt file and follows the rules specified for our crawler.

Our crawler:

Reads robots.txt before crawling.
Respects crawl restrictions.
Avoids blocked directories and files.
Operates responsibly to minimize server load.
Focuses on publicly accessible content only.
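
The check-before-crawl behavior listed above can be sketched as a filter over a queue of candidate URLs. This is an illustration built on Python's urllib.robotparser, not Ni1's actual code; filter_allowed and the sample URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def filter_allowed(robots_txt: str, user_agent: str, urls: list[str]) -> list[str]:
    """Return only the URLs that robots_txt permits user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


robots_txt = """\
User-agent: Ni1Bot
Disallow: /admin/
"""

queue = [
    "https://example.com/",
    "https://example.com/admin/login",
    "https://example.com/products/",
]

# The /admin/ URL is dropped before any request would be made.
print(filter_allowed(robots_txt, "Ni1Bot", queue))
# ['https://example.com/', 'https://example.com/products/']
```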

Allowing Ni1 to Crawl Your Website

If you would like Ni1 to crawl and index your website, you can explicitly allow our crawler in your robots.txt file.

Example:

User-agent: Ni1Bot
Allow: /

This allows Ni1 to access all publicly available pages on your website.

To allow all crawlers, not only Ni1, use:

User-agent: *
Allow: /

Blocking Ni1 From Specific Areas

You can restrict access to selected directories.

Example:

User-agent: Ni1Bot
Disallow: /admin/
Disallow: /private/

Ni1 will avoid these locations.
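
Disallow rules match by path prefix, so the trailing slash matters: /admin/ covers everything inside that directory but not a sibling path such as /administrator. The sketch below checks a few sample paths with Python's urllib.robotparser; the paths are made up for illustration, and a given crawler's matching may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: Ni1Bot",
    "Disallow: /admin/",
    "Disallow: /private/",
])

# Sample paths are illustrative only.
paths = ["/admin/", "/admin/users", "/administrator", "/private/notes.txt", "/public/"]
results = {path: parser.can_fetch("Ni1Bot", path) for path in paths}

for path, ok in results.items():
    print(f"{path:22} {'allowed' if ok else 'blocked'}")
```

Here /administrator and /public/ come back as allowed, while the other three paths are blocked.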

Blocking Ni1 Completely

If you do not want Ni1 to crawl your website, use:

User-agent: Ni1Bot
Disallow: /

Ni1 will respect this directive and will not crawl your site.
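
A common slip is leaving the Disallow value empty. Under the robots.txt standard, an empty Disallow blocks nothing, which is the opposite of Disallow: /. The following sketch shows the difference using Python's urllib.robotparser; can_crawl and the sample path are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, path: str = "/any/page.html") -> bool:
    """Report whether Ni1Bot may fetch path under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Ni1Bot", path)


block_all = "User-agent: Ni1Bot\nDisallow: /\n"
block_nothing = "User-agent: Ni1Bot\nDisallow:\n"  # empty value: no restriction

print(can_crawl(block_all))      # False
print(can_crawl(block_nothing))  # True
```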

Managing Your robots.txt File

Creating and updating a robots.txt file is straightforward:

Step 1: Create the File

Create a plain text file named:

robots.txt

Step 2: Add Your Rules

Specify which crawlers may access which areas of your website.

Example:

User-agent: *
Disallow: /temp/
Disallow: /backup/

User-agent: Ni1Bot
Allow: /
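
Note how the two groups in this example interact. Under the robots.txt standard (RFC 9309), a crawler follows the group that names it and falls back to the * group only when no named group matches. Assuming Ni1Bot selects its group this way, it may crawl /temp/ and /backup/ here, while other crawlers may not. The sketch below checks this with Python's urllib.robotparser; OtherBot and the paths are placeholders.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /temp/
Disallow: /backup/

User-agent: Ni1Bot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ni1Bot matches its own group, so the * rules do not apply to it.
ni1_temp = parser.can_fetch("Ni1Bot", "/temp/cache.html")
# "OtherBot" is a placeholder for any crawler without its own group.
other_temp = parser.can_fetch("OtherBot", "/temp/cache.html")
other_home = parser.can_fetch("OtherBot", "/index.html")

print(ni1_temp, other_temp, other_home)  # True False True
```

If Ni1Bot should also stay out of /temp/ and /backup/, repeat those Disallow lines inside the Ni1Bot group.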

Step 3: Upload to Your Website Root

Place the file in your site’s root directory:

https://yourdomain.com/robots.txt

Step 4: Verify Accessibility

Confirm that the file loads when you open its URL (for example, https://yourdomain.com/robots.txt) directly in a browser.

Example robots.txt Configurations

Allow Everything

User-agent: *
Allow: /

Block a Private Folder

User-agent: *
Disallow: /private/

Block Multiple Directories

User-agent: *
Disallow: /admin/
Disallow: /internal/
Disallow: /backup/

Allow Ni1 but Block Others

User-agent: Ni1Bot
Allow: /

User-agent: *
Disallow: /

Best Practices

Only block content that should not be crawled.
Keep robots.txt simple and easy to maintain.
Review your rules whenever your website structure changes.
Protect anything sensitive with real access controls, such as authentication, rather than relying on robots.txt.
Remember that robots.txt manages crawler access; it is not a security mechanism for protecting sensitive data.

Our Commitment

Ni1 is built on transparency, privacy, and respect for the open web. We honor robots.txt directives, focus exclusively on publicly accessible content, and give website owners clear control over how their sites interact with our crawler.

If you have questions about Ni1Bot or need assistance managing crawler access, our team is always happy to help.