Automating file processing with n8n: extraction and generation
- This n8n workflow automates file processing by extracting data and generating documents. In a context where businesses regularly handle a wide variety of files, it is particularly useful for document management, analytics, and content development teams. With this n8n automation, users can ingest local files, process them, and produce usable output without manual intervention.
- The workflow starts with a 'Local File Trigger' node, which watches a specific directory for new files. When a file is detected, it is loaded and processed by several nodes, such as the 'Default Data Loader' and the 'Recursive Character Text Splitter', which prepare the data for analysis. The data is then enriched with language models through the 'Mistral Cloud Chat Model' nodes, enabling advanced text understanding and generation. Finally, the results are aggregated and formatted for export in the desired format (a sketch of the initial path handling appears after this list).
- The business benefits of this workflow are numerous: it reduces the time spent on manual file handling, minimizes human error, and improves responsiveness to analysis and reporting needs. By adopting this kind of automation, companies can improve operational efficiency and focus on higher-value tasks.
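Before anything else happens to the file, the 'Settings' node derives a project name and a filename from the path reported by the 'Local File Trigger' (which watches /home/node/storynotes/context). The TypeScript sketch below mirrors those n8n expressions; the function name parseTriggeredPath and the example path are illustrative only, not part of the workflow.

```typescript
// Minimal sketch of the path parsing done by the 'Settings' node. In n8n the same
// logic lives in expressions such as
//   {{ $json.path.split('/').slice(0, 4)[3] }}  and  {{ $json.path.split('/').last() }}.
interface FileContext {
  project: string;  // fourth segment of the absolute path
  path: string;     // full path reported by the Local File Trigger
  filename: string; // last segment of the path
}

function parseTriggeredPath(path: string): FileContext {
  const segments = path.split('/');
  return {
    project: segments.slice(0, 4)[3], // e.g. "/home/node/storynotes/context/a.pdf" -> "storynotes"
    path,
    filename: segments[segments.length - 1],
  };
}

// Example with the folder watched in this workflow (hypothetical file name):
console.log(parseTriggeredPath('/home/node/storynotes/context/report.pdf'));
// { project: 'storynotes', path: '/home/node/storynotes/context/report.pdf', filename: 'report.pdf' }
```

These values are later attached as metadata to each chunk stored in the vector store and reused when naming the exported notes.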
n8n workflow for data processing, document management, and data extraction: overview
Diagram of the nodes and connections of this n8n workflow, generated from the n8n JSON.
n8n workflow for data processing, document management, and data extraction: node details
{
"meta": {
"instanceId": "26ba763460b97c249b82942b23b6384876dfeb9327513332e743c5f6219c2b8e"
},
"nodes": [
{
"id": "a3af309b-d24c-42fe-8bcd-f330927c7a3c",
"name": "Local File Trigger",
"type": "n8n-nodes-base.localFileTrigger",
"position": [
140,
260
],
"parameters": {
"path": "/home/node/storynotes/context",
"events": [
"add"
],
"options": {
"usePolling": true,
"followSymlinks": true
},
"triggerOn": "folder"
},
"typeVersion": 1
},
{
"id": "048f9d67-6519-4dea-97df-aaddfefbfea2",
"name": "Default Data Loader",
"type": "@n8n/n8n-nodes-langchain.documentDefaultDataLoader",
"position": [
1300,
720
],
"parameters": {
"options": {
"metadata": {
"metadataValues": [
{
"name": "project",
"value": "={{ $('Settings').item.json.project }}"
},
{
"name": "filename",
"value": "={{ $('Settings').item.json.filename }}"
}
]
}
},
"jsonData": "={{ $json.data }}",
"jsonMode": "expressionData"
},
"typeVersion": 1
},
{
"id": "9e9047c9-4428-4afb-8c74-d6eb1075a65a",
"name": "Recursive Character Text Splitter",
"type": "@n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter",
"position": [
1300,
860
],
"parameters": {
"options": {},
"chunkSize": 2000
},
"typeVersion": 1
},
{
"id": "e42e3f82-6cd9-40c4-9da2-8f87ee5b3956",
"name": "Embeddings Mistral Cloud",
"type": "@n8n/n8n-nodes-langchain.embeddingsMistralCloud",
"position": [
1180,
720
],
"parameters": {
"options": {}
},
"credentials": {
"mistralCloudApi": {
"id": "EIl2QxhXAS9Hkg37",
"name": "Mistral Cloud account"
}
},
"typeVersion": 1
},
{
"id": "578c63db-4f6e-4341-ab0d-111debd519be",
"name": "Mistral Cloud Chat Model",
"type": "@n8n/n8n-nodes-langchain.lmChatMistralCloud",
"position": [
2660,
840
],
"parameters": {
"model": "open-mixtral-8x7b",
"options": {}
},
"credentials": {
"mistralCloudApi": {
"id": "EIl2QxhXAS9Hkg37",
"name": "Mistral Cloud account"
}
},
"typeVersion": 1
},
{
"id": "c34adb3e-1fb9-4248-ae83-2bac34c8b0a4",
"name": "Mistral Cloud Chat Model1",
"type": "@n8n/n8n-nodes-langchain.lmChatMistralCloud",
"position": [
1200,
400
],
"parameters": {
"model": "open-mixtral-8x7b",
"options": {}
},
"credentials": {
"mistralCloudApi": {
"id": "EIl2QxhXAS9Hkg37",
"name": "Mistral Cloud account"
}
},
"typeVersion": 1
},
{
"id": "98e6dcc0-1e3a-4119-b657-0949f34ba525",
"name": "Prep Incoming Doc",
"type": "n8n-nodes-base.set",
"position": [
900,
420
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "da64ffde-1e8f-478d-baea-59fc05e6d3ce",
"name": "data",
"type": "string",
"value": "={{ $json.text }}"
}
]
}
},
"typeVersion": 3.3
},
{
"id": "ab88cf9a-d310-4bef-9280-8b23729e7cc9",
"name": "Settings",
"type": "n8n-nodes-base.set",
"position": [
320,
260
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "df327b01-961c-4a49-8455-58c3fbff111a",
"name": "project",
"type": "string",
"value": "={{ $json.path.split('/').slice(0, 4)[3] }}"
},
{
"id": "6b7d26f9-3a38-417e-85d0-4e9d42476465",
"name": "path",
"type": "string",
"value": "={{ $json.path }}"
},
{
"id": "bb4471c7-d894-4739-99a6-4be247794ffa",
"name": "filename",
"type": "string",
"value": "={{ $json.path.split('/').last() }}"
}
]
}
},
"typeVersion": 3.3
},
{
"id": "35c6b678-e6e9-4adf-a904-909fa2401d5e",
"name": "Merge",
"type": "n8n-nodes-base.merge",
"position": [
1600,
420
],
"parameters": {
"mode": "chooseBranch"
},
"typeVersion": 2.1
},
{
"id": "0fa13be8-8500-486c-a1c6-cc1df00a4947",
"name": "Get Doc Types",
"type": "n8n-nodes-base.set",
"position": [
2000,
420
],
"parameters": {
"mode": "raw",
"options": {},
"jsonOutput": "{\n \"docs\": [\n {\n \"filename\": \"study_guide.md\",\n \"title\": \"Study Guide\",\n \"description\": \"A Study Guide is a consolidated resource designed to aid learning. This guide includes three key elements: * A short answer quiz accompanied by an answer key to test comprehension. * A curated list of long-form essay questions to encourage deeper analysis and synthesis of the material. * A glossary of key terms to reinforce understanding of important concepts.\"\n },\n {\n \"filename\": \"timeline.md\",\n \"title\": \"Timeline\",\n \"description\": \"A Timeline organizes all significant events described in the sources you have uploaded in chronological order. This ordered list makes it easier to understand the sequence of events and their connection to the broader context of your sources. In addition to the list of events, the Timeline also provides a “cast of characters,” which comprises short biographical sketches of all the important people mentioned in your uploaded sources. These short biographies can help you quickly grasp the roles of various individuals involved in the events described by the Timeline.\"\n },\n {\n \"filename\": \"briefing_doc.md\",\n \"title\": \"Briefing Doc\",\n \"description\": \"A Briefing Doc identifies and presents the most important facts and insights from the sources in an easy-to-understand outline format. This format is designed to provide a concise overview of the key takeaways from the uploaded materials.\"\n }\n ]\n}\n"
},
"executeOnce": true,
"typeVersion": 3.3
},
{
"id": "e3469368-f214-4549-844e-7febfbbf0202",
"name": "Split Out Doc Types",
"type": "n8n-nodes-base.splitOut",
"position": [
2160,
420
],
"parameters": {
"options": {},
"fieldToSplitOut": "docs"
},
"typeVersion": 1
},
{
"id": "df401e9e-2f70-4079-969b-6b61142fca37",
"name": "For Each Doc Type...",
"type": "n8n-nodes-base.splitInBatches",
"position": [
2340,
420
],
"parameters": {
"options": {}
},
"typeVersion": 3
},
{
"id": "c334b546-8e11-424d-bdd5-006e7086f24b",
"name": "Item List Output Parser",
"type": "@n8n/n8n-nodes-langchain.outputParserItemList",
"position": [
2840,
840
],
"parameters": {
"options": {}
},
"typeVersion": 1
},
{
"id": "4267c2b5-f1cd-4df7-84ee-be01a643a1c1",
"name": "Vector Store Retriever",
"type": "@n8n/n8n-nodes-langchain.retrieverVectorStore",
"position": [
3200,
840
],
"parameters": {},
"typeVersion": 1
},
{
"id": "abf833ec-8a6d-4e13-a526-0ea6b80d578f",
"name": "Embeddings Mistral Cloud1",
"type": "@n8n/n8n-nodes-langchain.embeddingsMistralCloud",
"position": [
3200,
1060
],
"parameters": {
"options": {}
},
"credentials": {
"mistralCloudApi": {
"id": "EIl2QxhXAS9Hkg37",
"name": "Mistral Cloud account"
}
},
"typeVersion": 1
},
{
"id": "a0e50185-6662-4b11-9922-59e8b06e4967",
"name": "Qdrant Vector Store1",
"type": "@n8n/n8n-nodes-langchain.vectorStoreQdrant",
"position": [
3200,
940
],
"parameters": {
"qdrantCollection": {
"__rl": true,
"mode": "list",
"value": "storynotes",
"cachedResultName": "storynotes"
}
},
"credentials": {
"qdrantApi": {
"id": "NyinAS3Pgfik66w5",
"name": "QdrantApi account"
}
},
"typeVersion": 1
},
{
"id": "20c5766a-d3ce-4c01-a76b-facf1a00abc2",
"name": "Mistral Cloud Chat Model2",
"type": "@n8n/n8n-nodes-langchain.lmChatMistralCloud",
"position": [
3100,
840
],
"parameters": {
"options": {}
},
"credentials": {
"mistralCloudApi": {
"id": "EIl2QxhXAS9Hkg37",
"name": "Mistral Cloud account"
}
},
"typeVersion": 1
},
{
"id": "f049b7af-07f3-47e5-9476-68d73a387978",
"name": "Split Out",
"type": "n8n-nodes-base.splitOut",
"position": [
2960,
680
],
"parameters": {
"options": {},
"fieldToSplitOut": "response"
},
"typeVersion": 1
},
{
"id": "39042ae0-e17f-46cd-84be-728868950d84",
"name": "Aggregate",
"type": "n8n-nodes-base.aggregate",
"position": [
3400,
680
],
"parameters": {
"options": {},
"fieldsToAggregate": {
"fieldToAggregate": [
{
"fieldToAggregate": "response.text"
}
]
}
},
"typeVersion": 1
},
{
"id": "e3b900c8-515d-4ac7-88fa-c364134ba9f9",
"name": "Mistral Cloud Chat Model3",
"type": "@n8n/n8n-nodes-langchain.lmChatMistralCloud",
"position": [
3540,
840
],
"parameters": {
"model": "open-mixtral-8x7b",
"options": {}
},
"credentials": {
"mistralCloudApi": {
"id": "EIl2QxhXAS9Hkg37",
"name": "Mistral Cloud account"
}
},
"typeVersion": 1
},
{
"id": "efb26a5d-6a61-44b2-ad99-6d1f8b48998d",
"name": "Discover",
"type": "@n8n/n8n-nodes-langchain.chainRetrievalQa",
"position": [
3100,
680
],
"parameters": {
"text": "={{ $json.response }}",
"promptType": "define"
},
"typeVersion": 1.3
},
{
"id": "302b7523-898e-47af-8941-aa5f8a58fd9c",
"name": "2secs",
"type": "n8n-nodes-base.wait",
"position": [
3880,
1060
],
"webhookId": "ec58ab18-03c5-4b58-bc2e-24415a236c72",
"parameters": {},
"typeVersion": 1.1
},
{
"id": "007857b0-c12c-4c57-b07f-db30526cd747",
"name": "Get Generated Documents",
"type": "n8n-nodes-base.set",
"position": [
2680,
240
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "b38546b2-47c4-4967-a2d7-98aebd589e95",
"name": "data",
"type": "string",
"value": "={{ $json.text }}"
},
{
"id": "a263519a-aa05-410a-b4f0-f5e22cc5058c",
"name": "path",
"type": "string",
"value": "={{ $('Prep For AI').item.json.path }}"
},
{
"id": "ec1687d6-0ea9-460f-b9d4-ae4a7e229e12",
"name": "filename",
"type": "string",
"value": "={{ $('Prep For AI').item.json.name }}"
}
]
}
},
"typeVersion": 3.3
},
{
"id": "36fac35f-df10-41ab-96a7-3a5e67f9d8df",
"name": "Generate",
"type": "@n8n/n8n-nodes-langchain.chainLlm",
"position": [
3540,
680
],
"parameters": {
"text": "=## Document\n{{ $json.text.join('\\n') }}",
"messages": {
"messageValues": [
{
"message": "=Your job is to create a {{ $('For Each Doc Type...').item.json.title }} for the given document. {{ $('For Each Doc Type...').item.json.description }}\n\nGenerate a {{ $('For Each Doc Type...').item.json.title }} for the given document. If questions are generated, generate the answers alongside them. Format your response in markdown; use \"#\" to format headings, use \"*\" to format lists."
}
]
},
"promptType": "define"
},
"typeVersion": 1.4
},
{
"id": "b9a79cb0-bcc1-4d73-af93-5f8d7e2258a9",
"name": "Prep For AI",
"type": "n8n-nodes-base.set",
"position": [
1760,
420
],
"parameters": {
"options": {},
"assignments": {
"assignments": [
{
"id": "5c864125-c884-4d33-b0ed-e3eecd354196",
"name": "id",
"type": "string",
"value": "={{ $('Settings').first().json.filename.hash() }}"
},
{
"id": "93ac14c1-ae97-4ef2-a66f-6c1110f3b0fc",
"name": "project",
"type": "string",
"value": "={{ $('Settings').first().json.project }}"
},
{
"id": "fafd16b9-0002-4f7c-89d0-29788f8ec472",
"name": "path",
"type": "string",
"value": "={{ $('Settings').first().json.path }}"
},
{
"id": "5a5860ba-918b-4fb8-b18c-96c1cd22091a",
"name": "name",
"type": "string",
"value": "={{ $('Settings').first().json.filename }}"
},
{
"id": "1a1caf65-85d8-4f74-a3be-503ccfc0b2c9",
"name": "summary",
"type": "string",
"value": "={{ $('Summarization Chain').first().json.response.text }}"
}
]
}
},
"typeVersion": 3.3
},
{
"id": "e40c7e99-9813-4f06-92bb-dfb2839f1037",
"name": "To Binary",
"type": "n8n-nodes-base.convertToFile",
"position": [
2860,
240
],
"parameters": {
"options": {},
"operation": "toText",
"sourceProperty": "={{ $json.data }}"
},
"typeVersion": 1.1
},
{
"id": "b55df916-7a51-4114-91b8-18a3c6ba2c56",
"name": "Export to Folder",
"type": "n8n-nodes-base.readWriteFile",
"position": [
3020,
240
],
"parameters": {
"options": {},
"fileName": "={{\n $('Get Generated Documents').item.json.path.replace(\n $('Get Generated Documents').item.json.path.split('/').last(),\n $('Get Generated Documents').item.json.filename.substring(0,21) + '...' + $('Split Out Doc Types').item.json.title + '.md'\n )\n}}",
"operation": "write"
},
"typeVersion": 1
},
{
"id": "8490664e-0ca5-4839-ad03-d3f9706c99a3",
"name": "Get FileType",
"type": "n8n-nodes-base.switch",
"position": [
480,
420
],
"parameters": {
"rules": {
"values": [
{
"outputKey": "pdf",
"conditions": {
"options": {
"leftValue": "",
"caseSensitive": true,
"typeValidation": "strict"
},
"combinator": "and",
"conditions": [
{
"operator": {
"type": "string",
"operation": "equals"
},
"leftValue": "={{ $json.fileType }}",
"rightValue": "pdf"
}
]
},
"renameOutput": true
},
{
"outputKey": "docx",
"conditions": {
"options": {
"leftValue": "",
"caseSensitive": true,
"typeValidation": "strict"
},
"combinator": "and",
"conditions": [
{
"id": "3a5f509d-46fe-490c-95f0-35124873c63e",
"operator": {
"name": "filter.operator.equals",
"type": "string",
"operation": "equals"
},
"leftValue": "={{ $json.fileType }}",
"rightValue": "docx"
}
]
},
"renameOutput": true
},
{
"outputKey": "everything else",
"conditions": {
"options": {
"leftValue": "",
"caseSensitive": true,
"typeValidation": "strict"
},
"combinator": "and",
"conditions": [
{
"id": "75188d2f-4bea-44ea-a579-9b9a1bd1ea93",
"operator": {
"type": "object",
"operation": "exists",
"singleValue": true
},
"leftValue": "={{ $json }}",
"rightValue": ""
}
]
},
"renameOutput": true
}
]
},
"options": {}
},
"typeVersion": 3
},
{
"id": "386f7aac-f3b9-4565-907f-687d48b00c52",
"name": "Import File",
"type": "n8n-nodes-base.readWriteFile",
"position": [
320,
420
],
"parameters": {
"options": {},
"fileSelector": "={{ $json.path }}"
},
"typeVersion": 1
},
{
"id": "6ade93d5-61c3-450a-b78c-e210c18c0e70",
"name": "Extract from PDF",
"type": "n8n-nodes-base.extractFromFile",
"position": [
680,
260
],
"parameters": {
"options": {},
"operation": "pdf"
},
"typeVersion": 1
},
{
"id": "f413e139-3f9c-438f-8e82-824c38f09c6b",
"name": "Extract from DOCX",
"type": "n8n-nodes-base.extractFromFile",
"position": [
680,
420
],
"parameters": {
"options": {},
"operation": "ods"
},
"typeVersion": 1
},
{
"id": "455fadea-f5c7-4bea-983f-b06da4e57510",
"name": "Extract from TEXT",
"type": "n8n-nodes-base.extractFromFile",
"position": [
680,
580
],
"parameters": {
"options": {},
"operation": "text"
},
"typeVersion": 1
},
{
"id": "b2586011-4985-4075-b51c-90301b1a8cf9",
"name": "Summarization Chain",
"type": "@n8n/n8n-nodes-langchain.chainSummarization",
"position": [
1200,
260
],
"parameters": {
"options": {},
"chunkSize": 4000
},
"typeVersion": 2
},
{
"id": "1502e72c-e97e-4148-8138-01818ab5b104",
"name": "Sticky Note",
"type": "n8n-nodes-base.stickyNote",
"position": [
60,
85.80882007954312
],
"parameters": {
"color": 7,
"width": 995.1475972814769,
"height": 694.0931000693263,
"content": "## Step 1. Watch Folder and Import New Documents\n[Read more about Local File Trigger](https://docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.localfiletrigger)\n\nWith n8n's local file trigger, we're able to trigger the workflow when files are created in our target folder. We still have to import them however as the trigger will only give the file's path. The \"Extract From\" node is used to get at the file's contents."
},
"typeVersion": 1
},
{
"id": "7b3afc2c-3fb8-4589-9475-78f5617009cc",
"name": "Sticky Note1",
"type": "n8n-nodes-base.stickyNote",
"position": [
1080,
82.96464765818223
],
"parameters": {
"color": 7,
"width": 824.3300768713589,
"height": 949.8141899605673,
"content": "## Step 2. Summarise and Vectorise Document Contents\n[Learn more about using the Qdrant VectorStore](https://docs.n8n.io/integrations/builtin/cluster-nodes/root-nodes/n8n-nodes-langchain.vectorstoreqdrant)\n\nCapturing the document into our vector store is intended for a technique we'll use later known as Retrieval Augumented Generation or \"RAG\" for short. For our scenario, this allows our LLM to retrieve context more efficiently which produces better respsonses."
},
"typeVersion": 1
},
{
"id": "74aabb02-ca5d-41ad-b84f-92d66428b774",
"name": "Sticky Note2",
"type": "n8n-nodes-base.stickyNote",
"position": [
1940,
156.7963650826494
],
"parameters": {
"color": 7,
"width": 591.09953935829,
"height": 485.0226378812345,
"content": "## Step 3. Loop through Templates\n\nWe'll ask the LLM to help us generate 3 types of notes from the imported source document. These notes are intended to breakdown the content for faster study. Our templates for this demo are:\n(1) **Study guide**\n(2) **Briefing document**\n(3) **Timeline**"
},
"typeVersion": 1
},
{
"id": "b96f899d-4a44-491c-b164-a42feba129eb",
"name": "Sticky Note3",
"type": "n8n-nodes-base.stickyNote",
"position": [
2560,
480
],
"parameters": {
"color": 7,
"width": 1500.7886103732135,
"height": 806.6560661824452,
"content": "## Step 4. Use AI Agents to Query and Generate Template Documents\n[Read more about using the Question & Answer Retrieval Chain](https://docs.n8n.io/integrations/builtin/cluster-nodes/root-nodes/n8n-nodes-langchain.chainretrievalqa)\n\nn8n allows us to easily use a chain of LLMs as agents which can work together to handle any task!\nHere the agents generate questions to explore the content of the source document and use the answers to generate the template. "
},
"typeVersion": 1
},
{
"id": "77fda269-6877-422f-b6e6-4346bde862db",
"name": "Sticky Note4",
"type": "n8n-nodes-base.stickyNote",
"position": [
2560,
67.64523011966037
],
"parameters": {
"color": 7,
"width": 771.8710855215123,
"height": 384.22073222791266,
"content": "## Step 5. Export Generated Templates To Folder\n[Learn more about writing to the local filesystem](https://docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.filesreadwrite)\n\nFinally, the AI generated documents can now be exported to disk. This workflow makes it easy to generate any kind of document from various source material and can be used for training and sales."
},
"typeVersion": 1
},
{
"id": "08839972-f0f4-4144-bf27-810664cbf828",
"name": "Qdrant Vector Store",
"type": "@n8n/n8n-nodes-langchain.vectorStoreQdrant",
"position": [
1200,
560
],
"parameters": {
"mode": "insert",
"options": {},
"qdrantCollection": {
"__rl": true,
"mode": "list",
"value": "storynotes",
"cachedResultName": "storynotes"
}
},
"credentials": {
"qdrantApi": {
"id": "NyinAS3Pgfik66w5",
"name": "QdrantApi account"
}
},
"typeVersion": 1
},
{
"id": "7e216411-83ee-4b82-9e00-285d4f2d3224",
"name": "Sticky Note5",
"type": "n8n-nodes-base.stickyNote",
"position": [
-360,
80
],
"parameters": {
"width": 390.63004227317265,
"height": 401.0080676370763,
"content": "## Try It Out! \n\n### This workflow automates generating notes from a source document.\n* It watches a target folder to pick up new files.\n* When a new file is detected, it saves the contents of the file in a vectorstore.\n* multiple AI agents guided by a templates list, generate the predetermined notes.\n* These notes are then export alongside the original source file for the user.\n\n\n### Need Help?\nJoin the [Discord](https://discord.com/invite/XPKeKXeB7d) or ask in the [Forum](https://community.n8n.io/)!\n\nHappy Hacking!"
},
"typeVersion": 1
},
{
"id": "f2c363d3-a2bf-4468-ad54-f26649ce6ab8",
"name": "Interview",
"type": "@n8n/n8n-nodes-langchain.chainLlm",
"position": [
2660,
680
],
"parameters": {
"text": "=## document summary\n {{ $('Prep For AI').item.json.summary }}",
"messages": {
"messageValues": [
{
"message": "=Given the following document summary, what questions would you ask to create a {{ $('For Each Doc Type...').item.json.title }} for the document? Generate 5 questions."
}
]
},
"promptType": "define",
"hasOutputParser": true
},
"typeVersion": 1.4
},
{
"id": "ce3da55d-8c22-40bb-8781-63c2e6bcb824",
"name": "Sticky Note6",
"type": "n8n-nodes-base.stickyNote",
"position": [
1960,
380
],
"parameters": {
"width": 172.26820279743384,
"height": 295.46359440513226,
"content": "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n### 💡Add your own templates here!\n"
},
"typeVersion": 1
}
],
"pinData": {},
"connections": {
"2secs": {
"main": [
[
{
"node": "For Each Doc Type...",
"type": "main",
"index": 0
}
]
]
},
"Merge": {
"main": [
[
{
"node": "Prep For AI",
"type": "main",
"index": 0
}
]
]
},
"Discover": {
"main": [
[
{
"node": "Aggregate",
"type": "main",
"index": 0
}
]
]
},
"Generate": {
"main": [
[
{
"node": "2secs",
"type": "main",
"index": 0
}
]
]
},
"Settings": {
"main": [
[
{
"node": "Import File",
"type": "main",
"index": 0
}
]
]
},
"Aggregate": {
"main": [
[
{
"node": "Generate",
"type": "main",
"index": 0
}
]
]
},
"Interview": {
"main": [
[
{
"node": "Split Out",
"type": "main",
"index": 0
}
]
]
},
"Split Out": {
"main": [
[
{
"node": "Discover",
"type": "main",
"index": 0
}
]
]
},
"To Binary": {
"main": [
[
{
"node": "Export to Folder",
"type": "main",
"index": 0
}
]
]
},
"Import File": {
"main": [
[
{
"node": "Get FileType",
"type": "main",
"index": 0
}
]
]
},
"Prep For AI": {
"main": [
[
{
"node": "Get Doc Types",
"type": "main",
"index": 0
}
]
]
},
"Get FileType": {
"main": [
[
{
"node": "Extract from PDF",
"type": "main",
"index": 0
}
],
[
{
"node": "Extract from DOCX",
"type": "main",
"index": 0
}
],
[
{
"node": "Extract from TEXT",
"type": "main",
"index": 0
}
]
]
},
"Get Doc Types": {
"main": [
[
{
"node": "Split Out Doc Types",
"type": "main",
"index": 0
}
]
]
},
"Extract from PDF": {
"main": [
[
{
"node": "Prep Incoming Doc",
"type": "main",
"index": 0
}
]
]
},
"Extract from DOCX": {
"main": [
[
{
"node": "Prep Incoming Doc",
"type": "main",
"index": 0
}
]
]
},
"Extract from TEXT": {
"main": [
[
{
"node": "Prep Incoming Doc",
"type": "main",
"index": 0
}
]
]
},
"Prep Incoming Doc": {
"main": [
[
{
"node": "Qdrant Vector Store",
"type": "main",
"index": 0
},
{
"node": "Summarization Chain",
"type": "main",
"index": 0
}
]
]
},
"Local File Trigger": {
"main": [
[
{
"node": "Settings",
"type": "main",
"index": 0
}
]
]
},
"Default Data Loader": {
"ai_document": [
[
{
"node": "Qdrant Vector Store",
"type": "ai_document",
"index": 0
}
]
]
},
"Qdrant Vector Store": {
"main": [
[
{
"node": "Merge",
"type": "main",
"index": 1
}
]
]
},
"Split Out Doc Types": {
"main": [
[
{
"node": "For Each Doc Type...",
"type": "main",
"index": 0
}
]
]
},
"Summarization Chain": {
"main": [
[
{
"node": "Merge",
"type": "main",
"index": 0
}
]
]
},
"For Each Doc Type...": {
"main": [
[
{
"node": "Get Generated Documents",
"type": "main",
"index": 0
}
],
[
{
"node": "Interview",
"type": "main",
"index": 0
}
]
]
},
"Qdrant Vector Store1": {
"ai_vectorStore": [
[
{
"node": "Vector Store Retriever",
"type": "ai_vectorStore",
"index": 0
}
]
]
},
"Vector Store Retriever": {
"ai_retriever": [
[
{
"node": "Discover",
"type": "ai_retriever",
"index": 0
}
]
]
},
"Get Generated Documents": {
"main": [
[
{
"node": "To Binary",
"type": "main",
"index": 0
}
]
]
},
"Item List Output Parser": {
"ai_outputParser": [
[
{
"node": "Interview",
"type": "ai_outputParser",
"index": 0
}
]
]
},
"Embeddings Mistral Cloud": {
"ai_embedding": [
[
{
"node": "Qdrant Vector Store",
"type": "ai_embedding",
"index": 0
}
]
]
},
"Mistral Cloud Chat Model": {
"ai_languageModel": [
[
{
"node": "Interview",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"Embeddings Mistral Cloud1": {
"ai_embedding": [
[
{
"node": "Qdrant Vector Store1",
"type": "ai_embedding",
"index": 0
}
]
]
},
"Mistral Cloud Chat Model1": {
"ai_languageModel": [
[
{
"node": "Summarization Chain",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"Mistral Cloud Chat Model2": {
"ai_languageModel": [
[
{
"node": "Discover",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"Mistral Cloud Chat Model3": {
"ai_languageModel": [
[
{
"node": "Generate",
"type": "ai_languageModel",
"index": 0
}
]
]
},
"Recursive Character Text Splitter": {
"ai_textSplitter": [
[
{
"node": "Default Data Loader",
"type": "ai_textSplitter",
"index": 0
}
]
]
}
}
}
n8n workflow for data processing, document management, and data extraction: who is this workflow for?
This workflow is aimed at companies that regularly handle files, particularly in document management, data analysis, and content development. It is especially suited to technical teams and professionals with an intermediate understanding of automation tools. SMEs and large enterprises alike will find value in this solution.
n8n workflow for data processing, document management, and data extraction: problem solved
This workflow solves the problem of manual file handling, which is time-consuming and error-prone. By automating document extraction and generation, users remove the friction of manipulating data by hand, reduce the risk of errors, and get results faster. Once it is in place, users can expect a significant improvement in productivity and in the quality of the processed data.
n8n workflow for data processing, document management, and data extraction: workflow steps
- Step 1: The workflow starts with the 'Local File Trigger' node, which watches a folder for new files.
- Step 2: Once a file is detected, its contents are loaded and indexed via the 'Default Data Loader'.
- Step 3: The data is prepared and split into chunks by the 'Recursive Character Text Splitter'.
- Step 4: The Mistral Cloud language models enrich the data and generate responses (a sketch of this question-and-answer loop follows this list).
- Step 5: The results are aggregated and formatted for export, so the processed information can be used right away.
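To make the question-and-answer loop from Step 4 concrete, here is a minimal TypeScript sketch of what the 'For Each Doc Type...', 'Interview', 'Discover', 'Aggregate' and 'Generate' nodes do together. The helper functions are placeholder stand-ins (assumptions for illustration, not n8n or Mistral APIs) for the Mistral Cloud chat calls and the Qdrant-backed retrieval QA chain.

```typescript
// Sketch of the per-template loop: for each document type, ask the LLM for questions,
// answer them against the vector store (RAG), then draft the final markdown note.

interface DocTemplate {
  filename: string;    // e.g. "study_guide.md"
  title: string;       // e.g. "Study Guide"
  description: string; // instructions given to the LLM
}

// Placeholder: in the workflow this is the 'Interview' LLM chain with an item-list output parser.
async function generateQuestions(summary: string, template: DocTemplate): Promise<string[]> {
  return [`What should a ${template.title} cover for this document?`];
}

// Placeholder: in the workflow this is the 'Discover' retrieval QA chain backed by Qdrant.
async function answerWithRetrieval(question: string): Promise<string> {
  return `Answer to: ${question}`;
}

// Placeholder: in the workflow this is the 'Generate' LLM chain that writes markdown.
async function draftDocument(answers: string[], template: DocTemplate): Promise<string> {
  return `# ${template.title}\n\n${answers.join('\n')}`;
}

export async function generateNotes(summary: string, templates: DocTemplate[]) {
  const results = new Map<string, string>();
  for (const template of templates) {
    const questions = await generateQuestions(summary, template);             // 'Interview'
    const answers = await Promise.all(questions.map(answerWithRetrieval));    // 'Discover' + 'Aggregate'
    results.set(template.filename, await draftDocument(answers, template));   // 'Generate'
  }
  return results;
}
```

In the actual workflow this loop is driven by the 'For Each Doc Type...' (splitInBatches) node, with a two-second 'Wait' node between iterations, presumably to stay within API rate limits.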
n8n workflow for data processing, document management, and data extraction: customization guide
To customize this workflow, start by adjusting the folder path in the 'Local File Trigger' node so that it points to your own file directory. You can also tune the processing nodes, such as the 'chunkSize' in the 'Recursive Character Text Splitter' (set to 2000 here), to change how the data is segmented. To support other file types, add or modify the extraction nodes, such as 'Extract from PDF' or 'Extract from DOCX'. Finally, configure the export nodes to match your requirements for file format and destination; the sketch below shows how the output filename is currently derived.
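The 'Export to Folder' node builds the output path with an inline n8n expression. The sketch below reproduces that logic in TypeScript so it is easier to adapt; buildOutputPath and the example file name are illustrative, not part of the workflow.

```typescript
// Mirrors the 'Export to Folder' fileName expression: replace the last path segment
// of the source file with "<first 21 chars of filename>...<template title>.md".
function buildOutputPath(sourcePath: string, sourceFilename: string, templateTitle: string): string {
  const lastSegment = sourcePath.split('/').pop() ?? '';
  const outputName = sourceFilename.substring(0, 21) + '...' + templateTitle + '.md';
  return sourcePath.replace(lastSegment, outputName);
}

// Example: the Study Guide generated for a source PDF is written next to it.
console.log(buildOutputPath(
  '/home/node/storynotes/context/quarterly-report.pdf',
  'quarterly-report.pdf',
  'Study Guide',
));
// "/home/node/storynotes/context/quarterly-report.pdf...Study Guide.md"
```

Changing this logic (or the node's 'fileName' expression directly) is the easiest way to control where the generated notes land and how they are named.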