
It’s a game of 2 halves
This article builds on a previous post in which I walked through the process of setting up an Azure Function App so that functions defined within it can be called securely from an SPFx client (web part or command set extension). If you need help with that first part, I suggest you start here:
Calling an Azure Function App from the SharePoint Framework
This article picks up from the above and provides a walkthrough on how to call Azure AI services from functions in an Azure Function App so that an SPFx solution can successfully call AI models using these functions as a go-between.
My original article ended up being too long, so I split it in two.
Part 1: Calling Azure AI APIs from a Function App: Pt1 (Key Concepts) deals with all the conceptual stuff.
I think this is important, so I would urge you to read it first, as it explains the rationale for what we need to set up and configure in Azure. But if you just want a step-by-step guide on how to build an SPFx solution with integrated AI features, you are in the right place.
Briefly Setting the Scene
However, you can’t jump into this completely devoid of context, so please use the following dot points as a recap or to catch up with what I am doing here:
- I am developing a Knowledge Base (KB) application called K-Docs Publish, to be hosted in SharePoint using the SharePoint Framework (SPFx).
- The solution consists of 2 solution packages:
- Publisher: An SPFx Command Extension which allows users to select Word documents and convert them into HTML.
- Viewer: An SPFx Web Part (single-webpart-app-page) which allows users to:
- View the HTML rendition of the published Word documents.
- Browse and navigate between articles.
- Provide in-article navigation tools (such as a navigable table-of-contents, a la Wikipedia).
- Apply custom style mappings, so we can brand the solution.
- And (crucially for this article) provide a way to call AI services so users can ask questions and get useful responses, grounded on knowledge contained within the KB.
- To provide a useful AI capability I needed to break up published articles into discrete chunks, based on the header structure of the source documents, into what I call Units of Knowledge (UoK) – pronounced ‘oiks’.
- I use an AI text-embedding model to generate a search vector for each UoK when a source document is published.
- I call the same text-embedding model to return a vector for user input (their questions) at run time and compare the user-input vector with the vectors generated for each UoK using cosine similarity, to find the top 10 (or so) that are most relevant (see the sketch after this list).
- I then call a Retrieval-Augmented Generation (RAG) AI model with:
- The user input.
- The most relevant UoKs (used to generate the response, grounded in the context of the KB).
- The thread history (so as to retain a running context).
- Grounding text (which amounts to a set of instructions telling my RAG model how to behave).
- Format instructions for the response (summary, dot-points, HTML, markdown etc.).
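To make the similarity step concrete, here is a minimal TypeScript sketch of the kind of cosine comparison and top-10 ranking I am describing. The function and variable names are mine, for illustration only – they are not taken from the K-Docs Publish source.

function cosineSimilarity(a: number[], b: number[]): number {
    // cos(theta) = (a · b) / (|a| × |b|)
    let dot = 0, magA = 0, magB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Score every UoK vector against the question vector and keep the 10 best matches.
function topMatches(questionVector: number[], uoks: { id: string; vector: number[] }[]): { id: string; score: number }[] {
    return uoks
        .map(u => ({ id: u.id, score: cosineSimilarity(questionVector, u.vector) }))
        .sort((x, y) => y.score - x.score)
        .slice(0, 10);
}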
The way I do this is to:
- Provision a Function App in Azure with functions that can be called from a trusted SPFx client (the viewer web part).
- Set up the necessary AI models in the Azure AI Foundry.
- Call the AI models from the functions in the Function App.
- Return the response to users of the SPFx solution.
As mentioned above, my previous article Calling an Azure Function App from the SharePoint Framework covered how to provision and configure a Function App secured with Entra ID and safely call functions defined within it from an SPFx solution. Consider it a precursor to what I cover in the rest of this post; if you are not au fait with Azure Functions, I urge you to start there.

This article focuses on the technical steps needed to call AI models (which we will deploy in the Azure AI Foundry) from functions defined in an Azure Function App, completing the processing pipeline depicted above: SPFx to Azure Function to AI model, back to the Azure Function, and finally returning the response to the SPFx client.
Enough already, let’s build!
I’ll start with text-embedding and then move on to RAG. Although the process for setting up the necessary models in the Azure AI Foundry is similar, the data we pass to each model and the response that is returned are of course different.
However, before we can deploy any AI models we need to create a project in the Azure AI Foundry.
Set up a Foundry Project
From the Foundry blade in the Azure portal, click on the Create button.

From the Create a Foundry resource wizard provide the appropriate settings for your project. I decided to run with a simple KDocs for my resource name and K-Docs for project name, thinking that I might at some point add AI capabilities to other products in the K-Docs family and would likely want them to all be managed from a single project.

I accepted the defaults for the remainder of the settings, and at the end of the wizard clicked on the Create button to start the provisioning process.
A minute or two later, the UI reports everything is set up and provides me with a Go to resource button.

Rather confusingly, when you go to the resource (or subsequently click the link to access the project), the page presents you with the Go to Foundry portal button.

The confusing part, for me at least, is that I thought I was already in the Foundry, but it seems that the Foundry exists as 2 layers:
- Azure Foundry (in Azure) where you create and manage projects.
- Microsoft Foundry which is like the workspace and configuration hub for each Foundry project.
Click on the Go to Foundry portal button and you will be transported to the configuration workspace for your project.

It feels alien because we are not in https://portal.azure.com/ anymore; we are now at https://ai.azure.com/ and there is no obvious way back to the main Azure site. If you click on the root breadcrumb link you don’t go back to the Azure Foundry (where your projects are listed) but rather end up at the root of ai.azure.com (where your projects are also listed), which Microsoft chose to call the Microsoft Foundry, just to make things super intuitive – not!

Until I got my head around what they have done, and in case you missed my sarcasm, I found this to be very counterintuitive. Moving right along!
The Text Embedding Model
This section steps through the process of provisioning and deploying a text-embedding model and testing it, first in Postman and then in the SPFx test harness that I used previously.
If you don’t know what a text-embedding model is, or why we need one, go read Part 1.
Provisioning a Text Embedding Model
From the KDocs project in the Microsoft Foundry (not to be confused with the Azure Foundry) select the Model catalog menu item and search for “text-embedding”.

The model I need is called text-embedding-3-small (highlighted in green above).
You can see that there are a couple of other text-embedding models I could have chosen but Copilot told me that text-embedding-3-small was the best option for what I needed as it’s fast and cheap – I like the sound of that.
Click on the text-embedding-3-small button and you will be directed to the following screen:

Click on the Use this model button and the Deploy text-embedding-3-small dialog will appear.

You can name your model anything you like but I just ran with the default value as provided.
Click on the Deploy button to deploy this model as an asset that can be used from your project.

Setting up Environment Variables
In my previous article on Azure Functions, I created a web part solution as a test harness and established that I could successfully call a test function from my SPFx solution.
We now need to take this one step further and create another function, but rather than returning a simple text response directly from the function, we need it to call the newly provisioned text-embedding model. We will provide it with some sample text and get a vector in response.
However, before we can do that, we need to set up environment variables for the following:
- Endpoint: The endpoint of the text-embedding model.
- API Key: The API key that allows us to access the endpoint.
- Deployment: The name of the model we are calling.
- API Version: The version of the API that we are calling.
You pick up the values for these variables from the controls highlighted in the previous screenshot.
It is not at all obvious where to get the API version from, but you can see that it is included as a query parameter of the provided endpoint URI.
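For example, using the values we will plug into the function shortly, the endpoint URI follows this pattern, with the API version as the trailing query parameter (substitute your own resource name):

https://<your-resource>.cognitiveservices.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15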
Go to the Function App in Azure and select the Environment variables item from the Settings menu.

Click the Add button to add the required variables:




Make sure to click the Apply button to save the variables and then the Confirm button.

You will end up with the environment variables listed below, which can be accessed from the function we will set up next.

Creating the Function to call the Text Embedding Model
From the Functions tab in the Overview blade click the Create button and select the HTTP trigger template.

Click the Next button and from the Template details tab, give the function a name. I called mine Vectorize and set the Authorization level to Anonymous.
That last part might seem strange, but the Function App is already protected by Entra ID and in this case, setting it to Anonymous follows Microsoft’s best practice guidance.

Click the Create button at the bottom of the panel to provision the function.
When the process is complete, the browser will load the boilerplate code.

This is the same code I used for the test function in my previous article.
The Text-Embedding Function Code
We need to update the function code so that it calls the text embedding model.
Just click in the code editor, delete the default boilerplate code, and replace it with the code shown below:
module.exports = async function (context, req) {
    const endpoint = process.env.AZURE_COG_ENDPOINT;          // https://kdocs.cognitiveservices.azure.com
    const apiKey = process.env.AZURE_COG_KEY;                 // Cognitive Services key
    const apiVersion = process.env.AZURE_COG_API_VERSION;     // 2023-05-15
    const deployment = process.env.EMBEDDING_DEPLOYMENT_NAME; // text-embedding-3-small

    const url = `${endpoint}/openai/deployments/${deployment}/embeddings?api-version=${apiVersion}`;

    if (!req.body || !req.body.inputs) {
        context.res = {
            status: 400,
            body: { error: "Request body must contain an 'inputs' array." }
        };
        return;
    }

    try {
        const response = await fetch(url, {
            method: "POST",
            headers: {
                "Content-Type": "application/json",
                "api-key": apiKey
            },
            body: JSON.stringify({
                input: req.body.inputs
            })
        });

        if (!response.ok) {
            const errorText = await response.text();
            context.res = {
                status: response.status,
                body: { error: errorText }
            };
            return;
        }

        const data = await response.json();
        context.res = {
            status: 200,
            body: {
                embeddings: data.data,
                usage: data.usage
            }
        };
    } catch (err) {
        context.res = {
            status: 500,
            body: { error: err.message }
        };
    }
};
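One thing worth noting: the code above uses the global fetch API, which is only built into Node.js from version 18 onwards, so make sure your Function App is running on a Node 18 (or later) runtime.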
You might be wondering why we bother with environment variables when you could simply hard-code the URL, the API key, and so on. That would work, but it makes the solution less secure: environment variables are resolved server-side on the tenancy, so there is no risk of the values being intercepted on the client. Plus, I can potentially reuse these variables with other functions if I need to.
Test in Postman
It is useful to confirm all is well with the function by testing it using Postman. Please read my first article for guidance on setting up your environment to be Postman compatible, so that you can make authenticated function calls.
Once you have acquired a valid access token for your Postman call, set up the raw text of the body so that it is in the format expected by your call to the function and passed on to the text-embedding model.
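Based on the function code we just deployed, which expects an inputs array, a minimal raw (JSON) body looks like this – the sample sentence is my own placeholder:

{
    "inputs": ["Some sample text to vectorize"]
}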

Remember that the endpoint you are hitting here is the endpoint of the function, not the AI model, and you get it from the Get function URL button on the Code + Test page.

All 3 text boxes contain the same URL, so it doesn’t matter which one you copy.
Click the Send button in Postman and if all is well you get a response like the one shown in the screenshot below:

The embedding property of each item in the embeddings property of the response will contain an array of 1,536 numbers.
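For reference, the body our function sends back has roughly this shape (abbreviated – the embedding array actually holds all 1,536 numbers, the inner property names come from the Azure OpenAI embeddings API, and the usage property is discussed next):

{
    "embeddings": [
        {
            "object": "embedding",
            "index": 0,
            "embedding": [0.0123, -0.0456, 0.0789, ...]
        }
    ],
    "usage": { ... }
}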
If you look closely at the code in the function you will see that in addition to the data.data we are also fetching the data.usage information:

As can be seen from the above, data.data is what gets returned as the embeddings property, but if you scroll to the end of the response sent back to Postman, you can view the usage information.

In this case, the call used 7 tokens, which happens to be the number of words in my test input, but that was simply a coincidence.
In case you are wondering how much it cost me to make this call, you need to check the token pricing for the model, and at the time of writing that was $0.00002 per 1,000 tokens (or 2 cents per million tokens).
I used just 7 tokens, so the call cost me $0.00000014 USD, or 14 hundred-millionths of a dollar – essentially free!
I’ll come back to costings later, but rest assured this isn’t going to break the bank!
Test from SPFx Web Part
I enhanced my SPFx test harness web part to make sure that I was able to successfully make a similar call from an SPFx solution (rather than Postman).
The screenshot below shows the test harness web part UI:

The important part of my web part code is the asynchronous method callVectorizeFunction(), which gets called when the Text Vectorize button is clicked, as shown below:
private async callVectorizeFunction(): Promise<void> {
    const tokenProvider = await this.webPartContext.aadTokenProviderFactory.getTokenProvider();
    const applicationID: string = "b2694200-e2f9-4b8f-a748-f2eb2ccae54c";
    const clientURI: string = `api://${applicationID}`;
    const token = await tokenProvider.getToken(clientURI);
    const functionEndPointUrl: string = "https://kaboodle360-k-docs-publish-asf6b5bkehcwc6f5.australiasoutheast-01.azurewebsites.net/api/vectorize";
    const response = await fetch(
        functionEndPointUrl,
        {
            method: "POST",
            headers: {
                "Authorization": `Bearer ${token}`,
                "Content-Type": "application/json" // ensures the function runtime parses req.body as JSON
            },
            body: JSON.stringify({
                inputs: [this.vectorizeInputText]
            })
        }
    );
    this.responseJson = await response.json();
}
I have not included the rest of the code for the test harness as it is just standard Fluent UI controls which store input and response values in state.
Note that to successfully call the function we need two key values:
- The endpoint URL of the function.
- The application ID of the Function App.
Although I have hard-coded these values in my test harness, for a real-world solution you would want to make them configurable, perhaps via web part properties or a configuration file somewhere.
If you are concerned about storing the endpoint or application ID on the client, you need not worry: the Function App is secured by Entra ID, and with how we have set things up, a call to the vectorize function will only be successful if:
- It is executed by a user who bears an access token, i.e. they have already been authenticated in your tenancy.
- It is called from a client application that has been approved by a tenancy administrator.
The RAG Model
This section steps through the process of provisioning and deploying a RAG AI model, and once again we will test it in Postman first, before calling it from the SPFx test harness web part.
Provisioning the RAG Model
We provision the RAG AI model in more or less the same way as the text-embedding model, described in the previous section.
However, just to mix things up, this time I am going to use the New Foundry experience, which we can use to achieve the same outcome.

I do wish Microsoft would stop their continual updates of the UI. Not only is it very disorientating, but if you try to use AI to guide you through anything, there is always a lag between when Microsoft releases an update and when the AI picks up on it. It is incredibly frustrating to be continually telling the AI that the links and buttons which it insists are there simply no longer exist – rant over!
Click on the Find models tile.
From background research I knew I would be needing a Generative Pre-Trained Transformer (GPT) model; but which one? There are (at the time of writing) 40 to choose from!

Some of them, with short descriptions of Audio generation and Text to image, are clearly not what I am after, but that still leaves many options in the Chat completion, Responses category.
You would think that the newer the model (as indicated by the version number), the better the quality, but that’s not strictly true. After a bit of to-and-fro with Copilot, it settled on this advice:

So I decided to run with the recommended GPT-4.1-mini model, but if quality trumps cost and speed, you can opt for the non-mini version.
If you are setting things up for K-Docs Publish, you might also provision multiple models of different types and let admins select the model as a configuration setting, or even let users select the model themselves using the model picker, which amounts to a simple speed-versus-quality control.
Moving on, select the GPT-4.1-mini model tile from the list.

Click the Deploy button, to, well, deploy the model.

I went with the Default settings option and a few moments later my model was deployed. The new UI comes with a chat box all ready to go!

Environment Variables
This time we only need to set up a couple of environment variables: one for the model API key and the other for the Target URI (the project endpoint).
You might legitimately ask why we only need 2 environment variables here when previously we needed 4 for the Vectorize function. Good question. It turns out that not all AI models are called in the same way, and how you provision the model (using the current or the new UI) may also make a difference, although I am far from certain about this last point.
The onus is on you to figure out what it is that your API calls need for them to run correctly, which is why I strongly recommend that you use Postman.
The good news is that from the SPFx client things are consistent: we only need the endpoint URL of your functions and the Application ID of your Function App. Remember, we are calling Azure functions from the SPFx client, and that’s all we need to worry about on the client. How the function calls your AI models is handled server-side, which is why I urge you to test things with Postman.
You pick up these property values from the Details tab in the model page.

If you leave the Foundry, it can be quite tricky to work your way back to this page, but what you need to do is access the Models tab of the Assets blade of the Foundry project and then click on the link for your model.

Once you have picked up the model details, head back to the Function App and add the new environment variables as we did for the text-embedding model previously.
One for the API Key.

And a second for the Target URI:

Don’t forget to click the Apply button (followed by a confirm action dialog) to save the entries.

Creating the RAG Function
As with the text-embedding model, we need to set up a function in the Azure Function App so we can call the RAG AI model.
Click the + Create action button on the Functions tab from the Overview blade in your Function App.

From the Create Function panel, as before, select the HTTP trigger function and then click the Next button.

As before, give the function a meaningful name and set the Authorization level to Anonymous.

And then click the Create button to provision the new function.
Note that you must provide a callable function name, i.e. no spaces or special characters are allowed. The panel text box will happily let you enter an unacceptable function name, but it will then fail to provision!
A few moments later your boilerplate function will be set up.

The RAG Function Code
Here is the code I set in the function:
module.exports = async function (context, req) {
    try {
        const endpoint = process.env.RAG_PROJECT_URI;
        const apiKey = process.env.KEY_GPT41MINI;

        const modelName = req.body.modelName || "gpt-4.1-mini";
        const question = req.body.question;
        const chunks = req.body.contextChunks || [];
        const instructions = req.body.instructions || "";
        const format = req.body.format || "";
        const history = req.body.history || [];

        // Join the UoK chunks into a single block of context text
        const contextText = chunks.join("\n\n");

        const historyMessages = history.map(m => ({
            role: m.role,
            content: m.content
        }));

        // Build final message array
        const messages = [
            {
                role: "system",
                content: instructions
            },
            {
                role: "system",
                content: `CONTEXT:\n${contextText}`
            },
            ...historyMessages,
            {
                role: "user",
                content: `
QUESTION:
${question}
INSTRUCTIONS:
${instructions}
FORMAT:
${format}
`
            }
        ];

        const response = await fetch(endpoint, {
            method: "POST",
            headers: {
                "Content-Type": "application/json",
                "api-key": apiKey
            },
            body: JSON.stringify({
                model: modelName,
                // Flatten the message array into a single input string
                input: messages
                    .map(m => `${m.role.toUpperCase()}: ${m.content}`)
                    .join("\n\n"),
                temperature: 0.2,
                max_output_tokens: 800,
                top_p: 1
            })
        });

        const data = await response.json();
        context.res = {
            status: 200,
            body: data
        };
    } catch (err) {
        context.res = {
            status: 500,
            body: { error: err.message }
        };
    }
};
This function code is simple enough. We read the values from the environment variables so that we can call the RAG model API.
The rest of the data we need comes in through the body property of the req parameter. Initially we will pass some data from Postman and, once we have that working, we can run a similar test call from the SPFx test harness solution.
In my code, context information comes in as an array of text fragments (chunks from the most relevant items in the Extracts list – my UoKs), and we need to join them into a single string, because that’s what the RAG model expects.
I chose to send this context as an array of strings because that is how I deal with them in the SPFx client, but I guess I might have joined them as a single string from the client instead of joining them here in the function.
Note that this is how I send information from K-Docs Publish and so if you are setting up a function for this solution, you will need to follow this pattern and expect to receive the content information as chunks (an array of text values), which you will need to join.
Note also this line of code:
const modelName = req.body.modelName || "gpt-4.1-mini";
This defaults to the model name we defined in the Microsoft Foundry but provides the option to pass in a different model name from the client.
In other words, this code supports the possibility that we have deployed more than one RAG model, and we can choose which is called by passing the desired model name from the client. The list of model names that might be called must be managed from the client and of course they must map to a valid and active model that expects data in the same format.
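As a hypothetical sketch, a client opting into a different deployment would just add modelName to the payload it already sends. The gpt-4.1 name and the variable names here are illustrative only – the model name must match a deployment that actually exists in your Foundry project:

const payload = {
    modelName: "gpt-4.1", // illustrative – must map to a valid, active deployment
    question: userQuestion,
    contextChunks: relevantChunks,
    instructions: assistantInstructions,
    format: formatDirective,
    history: threadHistory
};
// ...then sent in the POST body as before: body: JSON.stringify(payload)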
Make sure to save the function.
Test from Postman
As before, we will want to make a test call from Postman to confirm that the plumbing and wiring are as they should be.
Make sure to authenticate with your tenancy and bag an access token. Then set up a suitable body payload and paste in the target function endpoint URL.
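Matching the properties the function reads from req.body, a minimal raw (JSON) body looks something like this – the question, chunks, instructions, and format are placeholder values of my own:

{
    "question": "How do I publish a Word document?",
    "contextChunks": [
        "To publish a document, select it in the library and choose Publish from the command bar.",
        "Published documents are converted to HTML and broken into Units of Knowledge."
    ],
    "instructions": "You are a helpful assistant. Answer only from the supplied context.",
    "format": "Respond with a short summary in plain text.",
    "history": []
}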

Click the Send button and you should get back a response that is something like what is shown below:

The bit that you will mainly be interested in is the text value, highlighted above. To get at that you’ll need to walk the properties of the returned response object.
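I use a small helper for this in the test harness code below (responseValue()). As a sketch, and assuming the response nests the answer under an output array of message items (as the Responses API typically does), it might look something like this – verify the property names against the actual payload you get back in Postman:

private responseValue(aiResponse: any): string {
    // Walk output -> content -> text; property names are assumptions to check against your response
    const output = aiResponse.output || [];
    for (const item of output) {
        for (const part of item.content || []) {
            if (typeof part.text === "string") {
                return part.text;
            }
        }
    }
    return "";
}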
Testing from the Test Harness
In the test harness we need to set up state variables and use various text box controls to edit properties that can then be passed to the function.
I have split the test UI over 2 tabs in a Pivot control. The second tab is used to provide the controls needed to set context information, apply assistant instructions, and issue formatting directives.

On the first tab I have set up controls that will accept user input and then render the response as part of the dialog history.

Click the Ask button to call an asynchronous method like before:
private async callRAGFunction(): Promise<void> {
    const tokenProvider = await this.webPartContext.aadTokenProviderFactory.getTokenProvider();
    const applicationID: string = "b2694200-e2f9-4b8f-a748-f2eb2ccae54c";
    const clientURI: string = `api://${applicationID}`;
    const token = await tokenProvider.getToken(clientURI);
    const functionEndPointUrl: string = "https://kaboodle360-k-docs-publish-asf6b5bkehcwc6f5.australiasoutheast-01.azurewebsites.net/api/K-Docs_Publish_RAG";
    const response = await fetch(
        functionEndPointUrl,
        {
            method: "POST",
            headers: {
                "Authorization": `Bearer ${token}`,
                "Content-Type": "application/json" // ensures the function runtime parses req.body as JSON
            },
            body: JSON.stringify({
                question: this.ragInput,
                contextChunks: this.ragContextChunks,
                instructions: this.ragInstructions,
                format: this.ragFormat,
                history: this.ragHistory
            })
        }
    );
    const aiResponse = await response.json();
    const responseText: string = this.responseValue(aiResponse);
    const history: IHistoryItem[] = this.ragHistory;
    const userInput: IHistoryItem = {
        role: "user",
        content: this.ragInput
    };
    const assistantResponse: IHistoryItem = {
        role: "assistant",
        content: responseText
    };
    history.push(userInput, assistantResponse);
    this.setState({ ragHistory: history, ragInput: "" });
}
This will then call the RAG endpoint, which sits at a different URL to the vectorize function, but note that because both functions are in the same Azure Function App, the applicationID is the same as before.
And here is the output:

How cool is that!
I won’t go through the code for wiring up the controls to capture user input, or for formatting it and the response data so that they appear presentably in the UI – consider that your homework!
Having fun
In K-Docs Publish, the settings for how a model is called are in the hands of the site administrator, as I consciously decided to remove such complexities from the UI for end users.
However, admins can create multiple assistants, each with their own instructions and personality, and you can use the test harness to experiment with the various settings and directives issued to your assistant.
This can be quite fun (well it amuses me at least). When I add some additional wacky guidance to the assistant:

I get:

Or how about a response from my lazy cat, Brian?


Or when you ask something not in the context text.

A cat with attitude, just what I need!
Have fun asking for responses in Shakespearian prose or Dickensian English or as a drunk Scotsman, or even something likely to be a bit more useful, such as asking for responses in a different language.


Now that’s Magic!
I don’t know about you, but I find this fun and truly amazing – magical even!
OK, I know you can do this in Copilot or ChatGPT and the rest, but the difference here is that the information on which the responses are based is provided by you (or your application) and so is grounded, focused, relevant, accurate, and far less prone to hallucinations, so long as you take care in how you call these models.
If you still haven’t read Part 1, then please take the time to go through it, as it explains my strategy for ensuring that the responses provided by K-Docs Publish are indeed accurate and useful, because they are grounded in the context-specific information within your KB.
The cost?
We have already seen that the text-embedding model is incredibly cheap, but what about the cost of using the RAG model?
Well, that depends on usage of course, but Copilot reckons the following:
- With 100 users,
- each asking 10 questions per day,
- with 22 workdays per month
That amounts to ≈ $6 per month, assuming roughly $0.00027 per question (i.e. 100 × 10 × 22 × $0.00027 ≈ $5.94).
If you are worried about cost control, you can set spending guardrails on the model, but I’m thinking that if you have over 1,000 users accessing this service daily, you will have a runaway success on your hands and a powerful argument to tell management that this is money well spent!
K-Docs Publish
I must place a temporary dampener on K-Docs Publish as the AI features I have described in this article are still in the prototype stage, but I promise they will be available in the next public release which we hope to have ready in May 2026. If all goes well, you should see it in the Microsoft Marketplace in June.
If you can’t wait until then, please reach out and I will see what we can do about enrolling you in our beta program.
Remember, all Kaboodle products are either Free or (as in this case) Freemium. Meaning that you can download and deploy a fully functional solution at zero licensing cost (from Kaboodle that is – you’ll still have to pay Microsoft a few shekels to call their APIs).
The Freemium product contains no adware, and you don’t need a credit card. You don’t even need to register, just download and use. All we ask for is constructive feedback and that you spread the word through your social media networks if you find our products useful.
You only need to purchase a license if you want to access enhanced features. And you only need to register if you want to trial the full product or buy a full license.
Check out these useful links:
- Product page (for downloads): https://kaboodle.software/Solutions/K-Docs/K-Docs-Publish
- Product video (You Tube): https://youtu.be/mF3mM7woeNE
- Register for a Trial: https://kaboodle.software/Trial
- Licensing Policy: https://kaboodle.software/Licensing
- Buy a full license or get a quote: https://kaboodle.software/Buy
- Contact us: https://kaboodle.software/Contact
