---
title: Setting up DocSearch for next.js
date: '2022-04-18'
tags: ['Next.js', 'JavaScript']
---
import Layout from 'layouts/MDXLayout';
import dynamic from 'next/dynamic';
import Image from 'components/mdx/Image';
import image1 from 'assets/images/p/setting-up-docsearch-for-nextjs/cannot-login-to-algolia-crawler.png';
import image2 from 'assets/images/p/setting-up-docsearch-for-nextjs/index-format.png';
export const RUASandpack = dynamic(() => import('components/RUA/RUASandpack'));
export const Tab = dynamic(() => import('components/RUA/tab'));
export const TabItem = dynamic(() => import('components/RUA/tab/TabItem'));
export const meta = {
  title: 'Setting up DocSearch for next.js',
  date: '2022-04-18',
  tags: ['Next.js', 'JavaScript'],
};
export default ({ children }) => <Layout {...meta}>{children}</Layout>;
I use Next.js and the MDX plugin to build my blog. It's a Next.js SSG project.
It's also a JAMStack site, so I need an external search engine.
Algolia is my first choice. We can build our own Algolia frontend UI, or use [DocSearch](https://github.com/algolia/docsearch).
## Purpose
Algolia splits DocSearch into two parts:
- A crawler that crawls our site.
- A frontend UI library that shows the search results.
In the legacy edition, Algolia provided docsearch-scraper so we could build our own crawler.
It can still be plugged into DocSearch v3, but it is now deprecated.
They introduced the [Algolia Crawler web interface](https://crawler.algolia.com/admin/users/login) to manage the crawler.
But I can't log in with my Algolia account.
<Image src={image1} alt="Can't login to Algolia Crawler" />
So I need to find another way to generate my post index.
## Index format
The DocSearch frontend UI reads results in a specific format. We just need to provide records in that same format,
and the DocSearch frontend UI will work.
<Image src={image2} alt="Index format" />
So we need to push records in that same format to Algolia.
## Push our data
Algolia provides a JavaScript API client to push data to Algolia.
<Tab defaultValue="yarn">
<TabItem label="yarn" value="yarn">
<pre>yarn add algoliasearch</pre>
</TabItem>
<TabItem label="npm" value="npm">
<pre>npm install algoliasearch</pre>
</TabItem>
</Tab>
The client will help us push data to Algolia. We just need to prepare our data.
### DocSearch format
Because DocSearch reads results in a specific format, our data needs to look like this:
```js
{
  content: null,
  hierarchy: {
    lvl0: 'Post',
    lvl1: slug,
    lvl2: heading,
  },
  type: 'lvl2',
  objectID: 'id',
  url: 'url',
}
```
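To make the shape concrete, here is a minimal sketch of a check for the fields shown above. `isDocSearchRecord` is a hypothetical helper, and it only covers the record shapes used in this post, not the full DocSearch schema.

```js
// Hypothetical helper: checks only the fields shown in this post.
// Assumption: a record's `type` names the deepest hierarchy level it fills,
// which holds for the lvl1 and lvl2 records built here.
function isDocSearchRecord(record) {
  return (
    typeof record === 'object' &&
    record !== null &&
    typeof record.objectID === 'string' &&
    typeof record.url === 'string' &&
    typeof record.type === 'string' &&
    typeof record.hierarchy === 'object' &&
    record.hierarchy !== null &&
    typeof record.hierarchy[record.type] === 'string'
  );
}

const example = {
  content: null,
  hierarchy: { lvl0: 'Post', lvl1: 'my-post', lvl2: 'Purpose' },
  type: 'lvl2',
  objectID: 'id',
  url: 'url',
};
console.log(isDocSearchRecord(example)); // true
```

Running a check like this before pushing can catch a missing `url` or a mismatched `type` early.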
### Generate format
To generate our data, we need:
- [dotenv](https://www.npmjs.com/package/dotenv): read the Algolia app ID and admin key from the `.env` file.
- [algoliasearch](https://www.npmjs.com/package/algoliasearch): JavaScript API client.
- `fs` and `path`: read the post files.
- [nanoid](https://www.npmjs.com/package/nanoid) (optional): generate a unique `objectID`.
To use ECMAScript `import`, the file needs the `.mjs` suffix so that Node.js accepts the `import` statement.
```js
// build-search.mjs
import { config } from 'dotenv';
// Use the full client: the lite build is search-only and cannot write to an index
import algoliasearch from 'algoliasearch';
import fs from 'fs';
import path from 'path';
import { nanoid } from 'nanoid';
```
Next, read the post content. First we list the files in the posts directory:
```js
const files = fs.readdirSync(path.join('pages/p'));
```
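`readdirSync` returns every entry in the directory, not only posts. A minimal sketch of narrowing the list to `.mdx` files before processing; `onlyMdx` is a hypothetical helper name, and whether `pages/p` holds anything else is an assumption.

```js
// Hypothetical helper: keep only file names that end in .mdx
function onlyMdx(fileNames) {
  return fileNames.filter((name) => name.endsWith('.mdx'));
}

console.log(onlyMdx(['first-post.mdx', '.DS_Store', 'index.tsx']));
// [ 'first-post.mdx' ]
```

In the script this would wrap the directory listing, for example `onlyMdx(fs.readdirSync(path.join('pages/p')))`.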
Then prepare an empty array to store the post data, and traverse each file's content to generate the format we need.
```js
const myPosts = [];
// forEach rather than map: we only want the side effect of filling myPosts
files.forEach((f) => {
  const content = fs.readFileSync(path.join('pages/p', f), 'utf-8');
  // The slug is the file name without the .mdx extension
  const slug = f.replace(/\.mdx$/, '');
  // Match h2 headings only: exactly two leading '#'
  const regex = /^#{2}(?!#)(.*)/gm;
  content.match(regex)?.forEach((h) => {
    // Drop the leading '## '
    const heading = h.substring(3);
    myPosts.push({
      content: null,
      hierarchy: {
        lvl0: 'Post',
        lvl1: slug,
        lvl2: heading,
      },
      type: 'lvl2',
      objectID: `${nanoid()}-https://rua.plus/p/${slug}`,
      url: `https://rua.plus/p/${slug}#${heading
        .toLocaleLowerCase()
        .replace(/ /g, '-')}`,
    });
  });
  // The lvl1 record for the post itself is also pushed here, inside this loop
});
```
The `type` property is the record's level in the table of contents.
I only want h2 titles in the search results, so I match them with `/^#{2}(?!#)(.*)/gm`.
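To see what that regular expression picks up, here is a small sketch that isolates the heading extraction and the anchor generation. `extractH2Headings` and `headingToAnchor` are hypothetical helper names, and the sample text is made up for illustration.

```js
// Hypothetical helpers; the names are illustrative, not from the original script.

// Return the text of every h2 heading ("## Title") in a Markdown string.
function extractH2Headings(content) {
  const regex = /^#{2}(?!#)(.*)/gm;
  // match() returns null when nothing matches, so fall back to an empty array
  const matches = content.match(regex) ?? [];
  // Drop the leading "## " (two hashes and one space)
  return matches.map((h) => h.substring(3));
}

// Build the URL fragment the same way the script does: lowercase, spaces to dashes.
function headingToAnchor(heading) {
  return heading.toLocaleLowerCase().replace(/ /g, '-');
}

const sample = ['# Title', '## Index format', '### Detail', '## Push our data'].join('\n');
console.log(extractH2Headings(sample)); // [ 'Index format', 'Push our data' ]
console.log(headingToAnchor('Push our data')); // push-our-data
```

Two caveats with this approach: a line starting with `## ` inside a fenced code block would also match, and the generated anchor only works if it agrees with the heading IDs the site actually renders.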
The post title is a `lvl1` record, pushed inside the same loop over `files`:
```js
myPosts.push({
  content: null,
  hierarchy: {
    lvl0: 'Post',
    lvl1: slug,
  },
  type: 'lvl1',
  objectID: `${nanoid()}-https://rua.plus/p/${slug}`,
  url: `https://rua.plus/p/${slug}`,
});
```
### Push to Algolia
The Algolia API is easy to use. First we need to specify the index name.
```js
const index = client.initIndex('RUA');
```
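The snippet above uses a `client` that none of the earlier snippets create. A minimal sketch of how it could be built from the credentials in `.env` follows. The variable names `ALGOLIA_APP_ID` and `ALGOLIA_ADMIN_KEY` and the helper `createIndex` are assumptions, not from the original script, and the sketch assumes a client version that exposes `initIndex`, as the code in this post does. The `algoliasearch` factory is passed in as a parameter so the sketch runs without the package installed.

```js
// Hypothetical helper. ALGOLIA_APP_ID and ALGOLIA_ADMIN_KEY are assumed
// variable names; use whatever names your .env file defines.
function createIndex(algoliasearch, env, indexName) {
  const appId = env.ALGOLIA_APP_ID;
  const adminKey = env.ALGOLIA_ADMIN_KEY;
  if (!appId || !adminKey) {
    throw new Error('Missing ALGOLIA_APP_ID or ALGOLIA_ADMIN_KEY');
  }
  // Writing to an index requires the admin key, not the search-only key
  const client = algoliasearch(appId, adminKey);
  return client.initIndex(indexName);
}
```

In `build-search.mjs` this would be called after dotenv's `config()` has loaded the `.env` file, for example `const index = createIndex(algoliasearch, process.env, 'RUA');`.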
Then save the objects.
```js
// myPosts is the array filled while traversing the post files
const algoliaResponse = await index.replaceAllObjects(myPosts);
```
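Because `replaceAllObjects` swaps out the whole index, it may be worth guarding the call. The sketch below wraps it in a hypothetical `pushRecords` helper that refuses to push an empty list and logs how many records were sent; the guard and the log message are my additions, not part of the original script.

```js
// Hypothetical wrapper around index.replaceAllObjects.
async function pushRecords(index, records) {
  if (records.length === 0) {
    // Replacing with an empty list would presumably leave the index empty,
    // so stop here (for example when the posts directory path is wrong).
    throw new Error('No records generated; refusing to replace the index');
  }
  const response = await index.replaceAllObjects(records);
  console.log(`Pushed ${records.length} records`);
  return response;
}
```

In the script it would replace the direct call: `const algoliaResponse = await pushRecords(index, myPosts);`.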
All done!