---
title: Setting up DocSearch for next.js
date: '2022-04-18'
tags: ['Next.js', 'JavaScript']
---

I build my blog with Next.js and the MDX plugin. It's a Next.js SSG (static site generation) project.

It's also a Jamstack site, so I need an external search engine.

Algolia is my first choice. We can build our own Algolia frontend UI or use [DocSearch](https://github.com/algolia/docsearch).

## Purpose

Algolia splits DocSearch into two parts:

- A crawler that crawls our site.

- A frontend UI library that shows the search results.

In the legacy edition, Algolia provided docsearch-scraper so we could run our own crawler.

It can still be plugged into DocSearch v3, but it is now deprecated.

Algolia introduced the [Algolia Crawler web interface](https://crawler.algolia.com/admin/users/login) to manage the crawler instead.

But I can't log in with my Algolia account.

<Image
  src={
    '/images/p/setting-up-docsearch-for-nextjs/cannot-login-to-algolia-crawler.png'
  }
  alt="Can't login to Algolia Crawler"
  width="586"
  height="131"
/>

So I need to find another way to generate my post index.

## Index format

The DocSearch frontend UI reads results in a specific format. We just need to provide records in that same format.

Then the DocSearch frontend UI will work.

<Image
  src={'/images/p/setting-up-docsearch-for-nextjs/index-format.png'}
  alt="Index format"
  width="516"
  height="197"
/>

So we need to push records in the same format to Algolia.

## Push our data

Algolia provides a JavaScript API client for pushing data to Algolia.

<Tab defaultValue="yarn">
  <TabItem label="yarn" value="yarn">
    <pre>yarn add algoliasearch</pre>
  </TabItem>
  <TabItem label="npm" value="npm">
    <pre>npm install algoliasearch</pre>
  </TabItem>
</Tab>

The client handles the upload for us. We just need to prepare our data.

### DocSearch format

Because DocSearch reads results in a specific format, each record needs to look like this:

```js
{
  content: null,
  hierarchy: {
    lvl0: 'Post',
    lvl1: slug,
    lvl2: heading,
  },
  type: 'lvl2',
  objectID: 'id',
  url: 'url',
}
```

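Since a malformed record only shows up later as a missing search result, it can help to check each record before pushing it. The following is a minimal sketch, assuming only the two record types this post generates (`lvl1` and `lvl2`). `checkRecord` is a hypothetical helper and not part of DocSearch or the Algolia client.

```javascript
// Hypothetical helper (not a DocSearch or Algolia API).
// Returns a list of problems; an empty list means the record looks usable.
// Assumption: only 'lvl1' and 'lvl2' records are generated, as in this post.
function checkRecord(record) {
  const problems = [];
  const knownTypes = ['lvl1', 'lvl2'];

  if (typeof record.objectID !== 'string' || record.objectID === '') {
    problems.push('objectID must be a non-empty string');
  }
  if (typeof record.url !== 'string' || record.url === '') {
    problems.push('url must be a non-empty string');
  }
  if (!knownTypes.includes(record.type)) {
    problems.push(`unexpected type: ${record.type}`);
  } else if (!record.hierarchy || !record.hierarchy[record.type]) {
    // The hierarchy entry named by `type` must be filled in.
    problems.push(`hierarchy.${record.type} is missing`);
  }
  return problems;
}
```

A record passes when its `type` names a level that is actually present in `hierarchy`, which is the relationship the example record above relies on.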
### Generate format

To generate our data, we need:

- [dotenv](https://www.npmjs.com/package/dotenv): reads the Algolia app ID and admin key from the `.env` file.
- [algoliasearch](https://www.npmjs.com/package/algoliasearch): the JavaScript API client.
- `fs` and `path`: read the post files.
- [nanoid](https://www.npmjs.com/package/nanoid) (optional): generates a unique `objectID`.

To use the ECMAScript `import` statement, give the file the `.mjs` suffix so that Node.js treats it as an ES module.

```js
// build-search.mjs

import { config } from 'dotenv';
// Use the full client: the `lite` build is search-only and cannot write records.
import algoliasearch from 'algoliasearch';
import fs from 'fs';
import path from 'path';
import { nanoid } from 'nanoid';

// Load the variables from `.env` into process.env.
config();
```

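A missing key in `.env` otherwise surfaces only as a failed request, so it may be worth failing early. This is a small sketch under stated assumptions: the variable names `ALGOLIA_APP_ID` and `ALGOLIA_ADMIN_KEY` are placeholders chosen for illustration, and `readAlgoliaConfig` is a hypothetical helper.

```javascript
// Hypothetical helper. ALGOLIA_APP_ID and ALGOLIA_ADMIN_KEY are assumed names;
// replace them with whatever your own `.env` file defines.
function readAlgoliaConfig(env) {
  const appId = env.ALGOLIA_APP_ID;
  const adminKey = env.ALGOLIA_ADMIN_KEY;

  // Collect every missing variable so the error names all of them at once.
  const missing = [];
  if (!appId) missing.push('ALGOLIA_APP_ID');
  if (!adminKey) missing.push('ALGOLIA_ADMIN_KEY');
  if (missing.length > 0) {
    throw new Error(`Missing in .env: ${missing.join(', ')}`);
  }
  return { appId, adminKey };
}
```

In `build-search.mjs` it would be called as `readAlgoliaConfig(process.env)` once dotenv's `config()` has run.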
Next, read the posts. First, list the post files in the `pages/p` directory:

```js
const files = fs.readdirSync(path.join('pages/p'));
```

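`readdirSync` returns every entry in the directory, so anything that is not a post would be processed as well. A minimal sketch of narrowing the list to `.mdx` files, assuming every post uses that extension; `toSlugs` is a hypothetical helper name.

```javascript
// Hypothetical helper: keep only `.mdx` entries and strip the extension,
// so each remaining name can be used directly as the post slug.
function toSlugs(fileNames) {
  return fileNames
    .filter((name) => name.endsWith('.mdx'))
    .map((name) => name.replace(/\.mdx$/, ''));
}
```

Passing the result of `fs.readdirSync` through this function leaves non-post entries such as editor files or components out of the index.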
Then prepare an empty array to store the post data, and traverse each file's content to generate the format we need.

```js
const myPosts = [];

files.forEach((f) => {
  const content = fs.readFileSync(path.join('pages/p', f), 'utf-8');

  const slug = f.replace(/\.mdx$/, '');
  // Match h2 headings only: exactly two leading `#` characters.
  const regex = /^#{2}(?!#)(.*)/gm;

  content.match(regex)?.forEach((h) => {
    // Drop the leading "## " to keep only the heading text.
    const heading = h.substring(3);

    myPosts.push({
      content: null,
      hierarchy: {
        lvl0: 'Post',
        lvl1: slug,
        lvl2: heading,
      },
      type: 'lvl2',
      objectID: `${nanoid()}-https://rua.plus/p/${slug}`,
      url: `https://rua.plus/p/${slug}#${heading
        .toLocaleLowerCase()
        .replace(/ /g, '-')}`,
    });
  });
});
```

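The heading extraction and the anchor generation in the loop above can be pulled out into two small functions, which makes them easy to try on sample input. This is a sketch and not the post's exact code: it uses `matchAll` with the capture group and `trim()` where the loop uses `substring(3)`, and both function names are hypothetical. Whether the generated anchor matches the `id` your site renders depends on how your MDX setup slugifies headings.

```javascript
// Hypothetical helper: return the text of every h2 heading in a Markdown string.
function extractH2Headings(content) {
  // Exactly two leading `#` characters; `###` and deeper are rejected by (?!#).
  const regex = /^#{2}(?!#)(.*)/gm;
  return Array.from(content.matchAll(regex), (m) => m[1].trim()).filter(Boolean);
}

// Hypothetical helper: same transformation the loop applies to build the URL hash.
function headingToAnchor(heading) {
  return heading.toLocaleLowerCase().replace(/ /g, '-');
}
```

Only spaces are replaced, so headings containing punctuation may need extra handling to line up with the rendered anchors.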
The `type` property indicates the record's level in the table of contents.

I only want h2 headings in the search results, so I match them with `/^#{2}(?!#)(.*)/gm`.

The post title is a record of type `lvl1`, pushed once per post inside the same file loop:

```js
myPosts.push({
  content: null,
  hierarchy: {
    lvl0: 'Post',
    lvl1: slug,
  },
  type: 'lvl1',
  objectID: `${nanoid()}-https://rua.plus/p/${slug}`,
  url: `https://rua.plus/p/${slug}`,
});
```

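To see how the `lvl1` and `lvl2` records fit together for a single post, here is a minimal sketch that builds both kinds in one function. `buildRecords` and its parameters are hypothetical. As a deliberate swap, it uses the record URL as `objectID` where the post uses `nanoid`, which keeps IDs stable between runs but assumes no two headings in a post share the same text.

```javascript
// Hypothetical helper: build the title record and one record per h2 heading.
// Assumption: the URL doubles as objectID (the post itself uses nanoid).
function buildRecords(slug, headings, baseUrl) {
  const postUrl = `${baseUrl}/p/${slug}`;

  const titleRecord = {
    content: null,
    hierarchy: { lvl0: 'Post', lvl1: slug },
    type: 'lvl1',
    objectID: postUrl,
    url: postUrl,
  };

  const headingRecords = headings.map((heading) => {
    // Same anchor rule as the loop: lower-case, spaces become hyphens.
    const url = `${postUrl}#${heading.toLocaleLowerCase().replace(/ /g, '-')}`;
    return {
      content: null,
      hierarchy: { lvl0: 'Post', lvl1: slug, lvl2: heading },
      type: 'lvl2',
      objectID: url,
      url,
    };
  });

  return [titleRecord, ...headingRecords];
}
```

For example, `buildRecords('hello', ['Purpose'], 'https://rua.plus')` yields one `lvl1` record for the post and one `lvl2` record for the heading.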
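### Push to Algolia

The Algolia API is easy to use. First, create a client with the app ID and admin key loaded from `.env`, then specify the index name. The environment variable names below are placeholders; use the ones defined in your own `.env` file.

```js
const client = algoliasearch(process.env.ALGOLIA_APP_ID, process.env.ALGOLIA_ADMIN_KEY); // placeholder names
const index = client.initIndex('rua');
```
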
Then save the objects. `replaceAllObjects` replaces the entire content of the index with the records we generated.

```js
const algoliaResponse = await index.replaceAllObjects(myPosts);
```

All done!