My dad always told me, “Be the change you want to see in the world” and “Stop leaving the door open, we’re not paying to cool the entire neighborhood.” Now, I can’t advise you on your electric bill, but I can share a little bit on how I stepped into a proof of concept for an interesting challenge when it comes to content translation.

If you’ve been doing Sitecore for any period of time, you’ve likely come across a multilingual site: multiple versions exist for the same item, but in different languages. Content creation and translation on these sites goes one of two ways: manual or automated. The manual world is quite a bit more work, so most clients end up going with an automated solution. A couple I’ve worked with in the past are Smartling and Lionbridge. Both do a great job of integrating with Sitecore, and both have a robust team to translate content and push it back into your Sitecore instance. Now, this isn’t an advertisement for these products (it’s a Tide Ad), more a point of “where we are today”. But let’s talk about the future!

The future of Sitecore is XM Cloud, a fully SaaS-based product that eliminates the need for upgrades. Pretty neat! Being fully SaaS, however, does come with some limitations. One of those is that you don’t have access to your SQL instance. This makes sense, given the nature of the beast: you for sure don’t want to break something down at the database level. Why does that matter for content translation? Well, both of the mentioned products actually modify your database, or stand up a new database altogether, to track translated versions.

[Image: Church Lady has concerns]

I brought this concern up in Slack, and one of the responses was “Well, use ChatGPT, Sitecore Connect and the Content Authoring APIs!” To which I mentally said a couple choice words, because that’s a bit…complex. Probably more complex than I could tackle in a day or two, which is about all I get to focus on a problem these days due to a mixture of adult ADD and actual work to do. But, after a binge session (yes, I have time for TV) of Last Week Tonight with John Oliver, I figured…maybe let’s just look into it.

I wanted to put a twist on this tinker though. I didn’t want to get into APIs and Connect, and all the things that come with the proposed solution. Given you can use curl to interface with ChatGPT, I figured I should be able to do this with PowerShell, and with that, I should be able to use Sitecore PowerShell Extensions (SPE from here on out, despite the SEO hit).

What did I need to do to get started? Here’s my To-Do:

  1. Communicate with ChatGPT over curl. Just stupid hello-world type things
  2. Translate Text with curl, and ensure it is valid.
  3. Build a simple UI in SPE to let the user pick what content to translate
  4. Put it all together
  5. ???
  6. Profit

Ok, so step one, communicate with ChatGPT. This one wasn’t too hard. I already had a login. You can snag one by going here: https://chat.openai.com/auth/login. Simple enough! Once you’re in the platform, you’ll need an API Key. Click on over to https://platform.openai.com/account/api-keys to register a new one. Jot it down when it’s created, though, because you won’t be shown the full key again. Now, let’s do a test! Here’s a basic curl command (swap API_KEY for your own key):

curl https://api.openai.com/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer API_KEY" -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}'

You’ll get a nifty response like the following:

{
  "id": "chatcmpl-6wz1vmmOV1lcxnQfvYutEm6U6bRYq",
  "object": "chat.completion",
  "created": 1679515643,
  "model": "gpt-3.5-turbo-0301",
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 11,
    "total_tokens": 20
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "\n\nHi there! How can I assist you today?"
      },
      "finish_reason": "stop",
      "index": 0
    }
  ]
}

Check out your response in the choices[0].message.content node. This is ChatGPT, talking back to you. It sounds super helpful. That is, until it sends robots back in time to kill you so you don’t give birth to the future rebellion leader who eventually shuts down Skynet.

[Image: RoboCop was way less creepy, imo]
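
Since the end goal is PowerShell rather than curl, here’s a minimal sketch of pulling that reply out of the JSON. It only parses a trimmed copy of the sample response above (no API call), and the variable names are mine, not anything from the final script.

```powershell
# Sample response body, trimmed from the output shown above.
$json = @'
{
  "model": "gpt-3.5-turbo-0301",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "\n\nHi there! How can I assist you today?"
      },
      "finish_reason": "stop",
      "index": 0
    }
  ]
}
'@

# ConvertFrom-Json turns the JSON text into an object we can walk with dots.
$response = $json | ConvertFrom-Json

# choices is an array, so take the first entry, then message, then content.
$reply = $response.choices[0].message.content

# The reply starts with two newlines; Trim() drops them.
$reply = $reply.Trim()

Write-Host $reply   # prints: Hi there! How can I assist you today?
```

Invoke-RestMethod does the ConvertFrom-Json step for you, so in the real script you can go straight to .choices[0].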

Ok, so Step 1, done. Now onto step two. How do you get ChatGPT to translate content for you? Maybe it’s as easy as this?

curl https://api.openai.com/v1/chat/completions -H "Content-Type: application/json" -H "Authorization: Bearer API_KEY" -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Translate \"Welcome to Sitecore\" to Spanish"}]}'

And what does it spit back at you?

{
  "id": "chatcmpl-6wzKgWuvAkC87D8KY5mCEmo1KXJDv",
  "object": "chat.completion",
  "created": 1679516806,
  "model": "gpt-3.5-turbo-0301",
  "usage": {
    "prompt_tokens": 16,
    "completion_tokens": 9,
    "total_tokens": 25
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "\n\n\"Bienvenido a Sitecore\""
      },
      "finish_reason": "stop",
      "index": 0
    }
  ]
}

Wait…it just did it?

[Image: The Easy Button meme was busy today, sorry]

Oddly enough, ChatGPT does know how to translate content. Is it perfect? Probably not. But then again, Translation Services have always been a “Trust then Verify” system for me. Get the content back and validate it doesn’t act weird. But, for as much as you’re paying for this (for this POC…nothing!) it seems pretty nice!

Onto step three! I really want a simple UI for this POC. I’d like to have the ability to do the following:

  1. Right click on an item and select “Translate”
  2. Select which fields should be translated.
  3. Select the target language
  4. Go!

This leads to a pretty simple UI. Here’s what it looks like:

[Image: Right-click context menu]
[Image: The minimalist UI]

This isn’t very complex looking. Again, this is just a prototype and not something you’d use on a live project (I mean you could, I’m not going to tell you how to live your life). What happens under the hood? Let’s talk about a couple settings:

[Image: Look ma, a module!]

This “GPT Translate Settings” item holds the critical settings required for translation:

  1. Which fields I can actually translate. I might not want to translate everything. Things like key-value pairs for CSS…no thanks. Things like images…not today! So within this, I’ve added the Title and Text fields from the Sample Template. These are visible on the simple UI above. And yes, this will be culled down based on what’s available on the actual item you’re translating.
  2. My API Key

[Image: Not complex, sorry.]

Below this item, you’ll see “French” and “Spanish” items. These items tell Sitecore how to talk to ChatGPT when requesting a translation. I can’t ask ChatGPT to “translate content into ‘es-MX’” as it doesn’t generally understand the nuances of ISO codes. However, you CAN tell it to translate something into “Spanish”. This section will have one item for every language. In the future, I’d probably switch this to an item with a Multilist. Essentially, these items translate “es-MX” into “Translate to Spanish”.
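
Stripped of the Sitecore items, the idea is just a lookup table. Here’s a rough sketch using a plain hashtable as a stand-in for those child items. The function name and the “fr-FR” entry are made up for illustration; the real script reads the mapping from the items under the settings item.

```powershell
# Hypothetical lookup table standing in for the "French" and "Spanish" items.
# Key: Sitecore language code. Value: the name we put in the prompt.
$friendlyNames = @{
    'es-MX' = 'Spanish'
    'fr-FR' = 'French'
}

function Get-FriendlyLanguage {
    param([string]$LanguageCode)

    if ($friendlyNames.ContainsKey($LanguageCode)) {
        return $friendlyNames[$LanguageCode]
    }

    # No mapping configured: return $null so the caller can bail out
    # instead of sending the raw code to ChatGPT.
    return $null
}

Get-FriendlyLanguage -LanguageCode 'es-MX'   # Spanish
```

Returning $null for an unmapped language is one possible choice; it gives the caller a chance to stop before an empty language name ends up in the prompt.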

Ok, so…now let’s look at the code. What steps are we following?

  1. Setup our Dialog
  2. Set our Translation Items (Language, Source, Type, Empty Target)
  3. Translate
  4. Create new Item with fields

Let’s look at the code for the first part:

#Grab the Context Item
$sourceItem = Get-Item .

#Load our Settings including Languages and Fields to Translate
$settings = Get-Item  "master:\sitecore\system\Modules\GPT Translation\GPT Translate Settings" -Language "en"

$apiKey = $settings["API Key"]

$filterFields = ([Sitecore.Data.Fields.MultilistField]$settings.Fields["Fields To Translate"]).GetItems()

$commonLanguages =  Get-ChildItem  -Item $settings -Language "en"

$fieldsOptions = @{}

$filterFields | ForEach-Object {
    $fieldsOptions.Add($_.Name, $_.ID)
}

#Grab languages from the system
$languages = Get-ChildItem  "master:\sitecore\system\Languages" -Language "en"

$languageOptions = @{}

$languages | ForEach-Object {
    $languageOptions.Add($_.Name, $_.Name)
}

#Remove Current Language
$languageOptions.Remove($SitecoreContextItem.Language.Name)

$props = @{
    Parameters = @(
        @{Name="fieldsToTranslateOption"; Editor="Checklist"; Title="Choose which fields to translate"; Options=$fieldsOptions; Tooltip="This list is configurable."}
        @{Name="languagesToTranslateOption"; Title="Choose which language to translate to."; Options=$languageOptions}

    )
    Title = "GPT Translation"
    Description = "Choose the right option."
    Width = 600
    Height = 400
    ShowHints = $true
}

$res = Read-Variable @props

There’s nothing incredibly complex about this.

  • Line 2 – Grab our current context item
  • Line 9 – All fields shouldn’t be translated. We’ve configured which ones can on our settings item
  • Line 20 – Get all system languages
  • Line 29 – It makes no sense to translate into the current language…sorry?
  • Line 44 – Show the UI dialog to the user and wait for some input

Now onto the next part. We’re creating a translation object which contains a few properties:

  • The Field ID
  • Untranslated Text
  • Field Type
  • Translated Text (empty at this point)

#We need to map our ISO Code (es-MX) to a friendly language (Spanish)
$commonLanguages | Foreach-Object {
    
    $langCode = (Get-Item $_["Language"]).Name
    
    if($langCode -eq $languagesToTranslateOption)
    {
        $commonLang = $_.Name
    }
    
}

Write-Host "API Key Loaded:" (-not [string]::IsNullOrEmpty($apiKey))
Write-Host "Fields to Translate:" $fieldsToTranslateOption
Write-Host "Current Language:" $SitecoreContextItem.Language.Name
Write-Host "Selected Language:" $languagesToTranslateOption
Write-Host "Selected Language (Common):" $commonLang 
Write-Host "Translating for: " $sourceItem.Name
Write-Host "Filter Count: " $filterFields.Count.ToString()

#If we didn't click OK, then it isn't OK!
if($res -ne "ok")
{
    Show-Alert "Aborted"
    Exit
}

#we need to make sure all the fields to translate exist on this item and remove the ones that don't
$filteredFields = New-Object -TypeName 'System.Collections.ArrayList'

$sourceItem.Fields | ForEach-Object {
    $sourceField = $_
    
    
    $filterFields | ForEach-Object { 
        
        if($_.ID -eq $sourceField.ID)
        {
            Write-Host "Found" $_.Name
            [void]$filteredFields.Add($_)
        }
    }
}

Write-Host "Translatable Fields Count: " $filteredFields.Count.ToString()

#Using our fields, let's create a list of translation objects we'll iterate through
$translations = New-Object -TypeName 'System.Collections.ArrayList'

$filteredFields | ForEach-Object {

     $fieldDef = Get-Item -Path master: -ID $_.ID
    
    $translationItem = @{
        FieldID = $_.ID
        Untranslated = $sourceItem.Fields[$_.ID].Value
        Translated = ""
        FieldType = $fieldDef.Type        
        
    }
        
    [void]$translations.Add($translationItem)
    
}

Here’s what’s going on:

  • Line 2 – We need to map the selected language code (es-MX) to a friendly name (“Spanish”), which happens here.
  • Line 13-19 – Lol Debug.
  • Line 29 – Cull our list of fields, so we don’t try to translate something which doesn’t exist on the item.
  • Line 50 – Create our list of Translation Objects from the data we’ve built

Now onto translation!

$translations | ForEach-Object {

    $translationAsk = "Text"

    #If you don't tell ChatGPT to translate something as HTML, it will break your styles (style="color: blue;" turns into style="color: azul;")
    if($_.FieldType -eq "Rich Text")
    {
        $translationAsk = "HTML"
    }
    
    $headers = @{
       Authorization = "Bearer " + $apiKey
    }
    
    #Generate our prompt
    #Translate the following Text/HTML to Spanish 'blah blah'
    $data = @{
        model = "gpt-3.5-turbo"
        messages = @(
            @{
                role = "user"
                content = "Translate the following " + $translationAsk + " to " + $commonLang + " `"" + $_.Untranslated + "`""
            }
        )
    
    }
    
    $Params = @{
        Method = "POST"
        Headers = $headers
        
        Body = $data | ConvertTo-Json
        Uri = "https://api.openai.com/v1/chat/completions"
        ContentType = "application/json; charset=utf-8"
    }
    
    $result = (Invoke-RestMethod @Params).choices[0]
    
    $_.Translated = $result.message.content
}

Within this chunk a few things happen:

  • Line 6 – You don’t need to tell ChatGPT how to translate, but by default it treats the input as plain text and isn’t aware of HTML. You don’t want style="color: blue;" turning into style="color: azul;" in every language. Based on the field type here, we swap the prompt to ask for an HTML translation when applicable. It works shockingly well!
  • Line 11 – Build our Auth Header
  • Line 17 – Build our Data Body. This is where we build the prompt, the same text you could type into ChatGPT if you were using the web UI.
  • Line 37 – Send our request away and then parse the result content back into our object
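
To see what actually goes over the wire, here’s a small sketch that builds the same prompt and JSON body without calling the API. New-TranslationPrompt is a hypothetical helper name I’m using for illustration, and the explicit -Depth on ConvertTo-Json is my addition to keep the nested messages array intact; the prompt format itself mirrors the script above.

```powershell
# Hypothetical helper: builds the prompt in the same format as the script.
function New-TranslationPrompt {
    param(
        [string]$Text,
        [string]$Language,
        [string]$FieldType
    )

    # Rich Text fields hold markup, so ask for an HTML translation.
    $ask = if ($FieldType -eq 'Rich Text') { 'HTML' } else { 'Text' }

    return "Translate the following $ask to $Language `"$Text`""
}

$prompt = New-TranslationPrompt -Text '<p style="color: blue;">Welcome</p>' `
                                -Language 'Spanish' `
                                -FieldType 'Rich Text'

# Same shape as the request body above, serialized to JSON.
$body = @{
    model    = 'gpt-3.5-turbo'
    messages = @(
        @{ role = 'user'; content = $prompt }
    )
} | ConvertTo-Json -Depth 5

Write-Host $prompt
Write-Host $body
```

Printing the body like this is a cheap way to check the prompt before spending tokens on it.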

Finally, we’re set to create our new content version:

$targetItem = Add-ItemVersion -Item $sourceItem -TargetLanguage $languagesToTranslateOption

$targetItem.Editing.BeginEdit()


$translations | ForEach-Object {

    Write-Host "Source:" $_.Untranslated
    Write-Host "Dest:" $_.Translated
    
    #we need to fix some encoding shenanigans
    $bytes = [System.Text.Encoding]::GetEncoding(1252).GetBytes($_.Translated);
    
    $fixed = [System.Text.Encoding]::UTF8.GetString($bytes).Trim();
    
    #Sometimes ChatGPT wraps our output in quotes...this nukes them
    if($fixed.Length -ge 2 -and $fixed[0] -eq "`"" -and $fixed[-1] -eq "`"")
    {
        $fixed = $fixed.Substring(1, $fixed.Length - 2).Trim() 
    }
    
    Write-Host "Fixed:" $fixed
    
    $targetItem[$_.FieldID] = $fixed
}

$targetItem.Editing.EndEdit()

This reads like a pretty typical item editing scenario with a few fun chunks:

  • Line 1 – Add a version in the target language selected from the UI
  • Line 3 – Open for Editing once (before we loop)
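  • Line 11 – This was a fun one. Once you start translating into other languages, the response from ChatGPT can come back with its accented characters garbled, because the UTF-8 bytes get read as Windows-1252. Re-encoding the text as 1252 and decoding it as UTF-8 undoes that. Check out this post for some deeper info on the nuances of UTF-8 encoding within Latin languages.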
  • Line 24 – Update the text in the item
  • Line 27 – Save and done!
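
If you want to poke at the cleanup step without Sitecore or an API key, here’s a self-contained sketch. Repair-Translation is a hypothetical name, and the garbled sample is manufactured locally by encoding a string as UTF-8 and reading it back as code page 1252. It assumes Windows PowerShell (which is what SPE runs on), where that code page is available.

```powershell
# Hypothetical helper wrapping the cleanup: undo the mis-decoding, trim,
# then drop one pair of wrapping double quotes if present.
function Repair-Translation {
    param([string]$Text)

    # Turn the characters back into the bytes they were read from
    # (as Windows-1252), then read those bytes as UTF-8.
    $bytes = [System.Text.Encoding]::GetEncoding(1252).GetBytes($Text)
    $fixed = [System.Text.Encoding]::UTF8.GetString($bytes).Trim()

    if ($fixed.Length -ge 2 -and $fixed[0] -eq '"' -and $fixed[-1] -eq '"') {
        $fixed = $fixed.Substring(1, $fixed.Length - 2).Trim()
    }

    return $fixed
}

# Build a garbled sample without typing non-ASCII characters into the script:
# "Cami" + o-acute + "n", wrapped in quotes, with two leading newlines.
$original = '"Cami' + [char]0x00F3 + 'n"'
$garbled  = [System.Text.Encoding]::GetEncoding(1252).GetString(
                [System.Text.Encoding]::UTF8.GetBytes("`n`n" + $original))

# Returns the word with the accent restored and the quotes removed.
Repair-Translation -Text $garbled
```

Only run the re-encoding on text that really was mis-decoded. Feeding it a string that is already correct will mangle the accented characters instead of fixing them.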

So there you have it. I’ve serialized the entire project to the following repo: https://github.com/RAhnemann/SitecoreGPTTranslate

[Image: This is NOT Me!]

Misc Notes:

  • I use a base template called “GPT Translatable” and a simple Rule in PowerShell to determine where the Context Menu is available. I might not want to add translation to a Site Definition…
  • Again, this is a rough prototype. Maybe a jumping off and/or inspiration point?
  • Turns out Gabe has done this with Azure Cognitive Services…and it looks pretty similar! https://www.sitecoregabe.com/2019/01/translating-text-in-sitecore-using.html

What would I like from a future version of this?

  1. Submit the item into a workflow state so it can be checked
  2. Optionally overwrite a version with new content. ChatGPT can give you different versions back depending on how it’s feeling.
  3. Ability to edit the content before it’s put into an item (like a preview function…maybe?)
  4. Translate into multiple languages at once
  5. More advanced data cleaning (sometimes weird characters get in there and should be stripped and trimmed)
  6. Multilist language configuration
  7. Better validation and error checking
  8. A little more feedback on the UI as to what’s going on
  9. Translate from Experience Editor, Pages and/or the Content Editor bar