Files
DefectingCat.github.io/content/posts/setting-up-docsearch-for-nextjs.mdx
2023-01-29 09:48:49 +08:00

179 lines
4.1 KiB
Plaintext

---
title: Setting up DocSearch for next.js
date: '2022-04-18'
tags: ['Next.js', 'JavaScript']
---
I use next.js and mdx plugin to build my blog site. It's a next.js SSG project.
Also it's a JAMStack site. So i need a extenal search engine.
The Algolia is my first choice. We can build our own Algolia front UI, or use [DocSearch](https://github.com/algolia/docsearch)
## Purpose
Algolia split DocSearch into to parts:
- A cralwer to crawl our sites.
- A frontend UI liburary to show search result.
In legacy edition, Algolia provide a docsearch-scraper to build our own crawler.
Although it's still can plug it to DocSearch v3. But now it's deprecated.
They introduct the [Algolia Crawler web interface](https://crawler.algolia.com/admin/users/login) to manage the crawler.
But i can't login with my Algolia account.
<Image
src={
'/images/p/setting-up-docsearch-for-nextjs/cannot-login-to-algolia-crawler.png'
}
alt="Can't login to Algolia Crawler"
width="586"
height="131"
/>
So i need find another way to generate my post index.
## Index format
The DocSearch frontend UI read result as specific format. We just need to provide the same format to DocSearch.
Then DocSearch fronted UI can works.
<Image
src={'/images/p/setting-up-docsearch-for-nextjs/index-format.png'}
alt="Index format"
width="516"
height="197"
/>
So we need post same format to Algolia.
## Push our data
Algolia provide JavaScript API Client to push data to Algolia.
<Tab defaultValue="yarn">
<TabItem label="yarn" value="yarn">
<pre>yarn add algoliasearch</pre>
</TabItem>
<TabItem label="npm" value="npm">
<pre>npm install algoliasearch</pre>
</TabItem>
</Tab>
The client will help us push data to Algolia. We just need to prepare out data.
### Docsearch format
Because Docsearch read result as specific format. our data need to be like this:
```js
{
content: null,
hierarchy: {
lvl0: 'Post',
lvl1: slug,
lvl2: heading,
},
type: 'lvl2',
objectID: 'id',
url: 'url',
}
```
### Generate format
For generate our data, we need:
- [dotenv](https://www.npmjs.com/package/dotenv): read Algolia app ID and admin key in `.env` file.
- [algoliasearch](https://www.npmjs.com/package/algoliasearch): JavaScript API client.
- `fs` and `path`: read post file.
- [nanoid](https://www.npmjs.com/package/nanoid) (optional): generate unique `objectID`.
For use ECMAScript `import`, we need set file suffix with `.mjs`. The node.js can use `import` statement.
```js
// build-search.mjs
import { config } from 'dotenv';
import algoliasearch from 'algoliasearch/lite.js';
import fs from 'fs';
import path from 'path';
import { nanoid } from 'nanoid';
```
Next, read post content from file. First we need read whole content from the file:
```js
const files = fs.readdirSync(path.join('pages/p'));
```
Then, prepare a empty array to store post data. And traverse content to generate format we need.
```js
const myPosts = [];
files.map((f) => {
const content = fs.readFileSync(path.join('pages/p', f), 'utf-8');
// const { content: meta, content } = matter(markdownWithMeta);
const slug = f.replace(/\.mdx$/, '');
const regex = /^#{2}(?!#)(.*)/gm;
content.match(regex)?.map((h) => {
const heading = h.substring(3);
myPosts.push({
content: null,
hierarchy: {
lvl0: 'Post',
lvl1: slug,
lvl2: heading,
},
type: 'lvl2',
objectID: `${nanoid()}-https://rua.plus/p/${slug}`,
url: `https://rua.plus/p/${slug}#${heading
.toLocaleLowerCase()
.replace(/ /g, '-')}`,
});
});
```
The `type` property means level of table of contents.
I just need h2 title in search result. So just match them with `/^#{2}(?!#)(.*)/gm`.
And post title is the `lvl1` type:
```js
myPosts.push({
content: null,
hierarchy: {
lvl0: 'Post',
lvl1: slug,
},
type: 'lvl1',
objectID: `${nanoid()}-https://rua.plus/p/${slug}`,
url: `https://rua.plus/p/${slug}`,
});
```
### Push to Algolia
Algolia API is easy to use. First we need specify the index name.
```js
const index = client.initIndex('rua');
```
And save the objects.
```js
const algoliaResponse = await index.replaceAllObjects(posts);
```
All done!