Tableau’s REGEX and ChatGPT

I have forever found REGEX to be a tough nut to crack. The syntax always confused me and on the rare occasion, I got the syntax right the results were wrong. At work, someone asked about extracting a consecutive 4-digit number from a string and for some reason my brain went “I wonder if ChatGPT can generate a REGEX statement for Tableau.

TL;DR – Yes, it can…

I tested three use cases that I have heard for REGEX, and I think I’m pretty happy with the results.
The first was a typical Zip Code extraction, where I asked:
Generate a Tableau REGEX expression to extract the postal code in an address string

In return, I got the following:
Assuming that the postal code follows the typical format of 5 digits (in the US), you could use the following Tableau REGEX expression to extract the postal code from an address string:
“`
REGEXP_EXTRACT([Address],’\\b\\d{5}\\b’)
“`
This expression looks for a sequence of exactly 5 digits surrounded by word boundaries (`\b`). This will match any US postal code in the address string.

If you are working with addresses from a different country, you may need to adjust the regular expression to match the format of postal codes in that country.

I liked that it also explained the methodology around the syntax because in the event i had to edit it, I at least have something to go on in my next set of searches or AI questions.

The second use case was around parsing a URL, where i asked:

Generate a Tableau REGEX statement to parse the URL for a marketing campaign

and in return, I got two examples!
Assuming that the URL for the marketing campaign includes a set of parameters (i.e. key-value pairs) after the question mark (`?`), you could use the following Tableau REGEX statement to parse the parameters and extract the value associated with a specific key:
“`
REGEXP_EXTRACT([URL],'[?&]key=([^&]*)’,1)
“`
This expression looks for a parameter with the key `key` (replace `key` with the actual key name) and extracts the value associated with that key. The `[^&]*` matches any sequence of characters that is not an ampersand (`&`). The `( )` captures the value as a group, which can be referred to as `1`. The `[?&]` ensures that the regular expression only matches parameters that are preceded by a question mark or an ampersand.

To use this expression, replace `[URL]` with the name of the field that contains the campaign URL, and replace `key` with the actual key name that you want to extract the value for. For example, if your campaign URL is `https://example.com/?utm_source=google&utm_medium=cpc&utm_campaign=spring_sale&discount_code=SPRING25`, and you want to extract the value for the `discount_code` parameter, the expression would be:
“`
REGEXP_EXTRACT([URL],'[?&]discount_code=([^&]*)’,1)
“`
This would extract the value `SPRING25` as a string.

The last one took my original query from a co-worker to the next level. The initial ask was how to extract a 4-digit consecutive number from as string, but now I upped the level of difficulty by asking:
Generate a Tableau REGEX statement that extracts the 3rd occurrence of a 3-digits consecutive number

…and ChatGPT delivered
You can use the following Tableau REGEX statement to extract the 3rd occurrence of a 3-digit consecutive number in a string:
“`
REGEXP_EXTRACT([string],'(?:\d{3}\D*){3}\K\d{3}’)
“`
Here, `[string]` is the name of the field that contains the string you want to extract the pattern from.

This expression uses a non-capturing group `(?:\d{3}\D*){3}` to find the third occurrence of a sequence of three digits followed by any non-digit characters (i.e. a non-capturing group is used as we don’t need to capture the content of this group). The `\K` is used to reset the match and exclude the previous matched groups. The final `\d{3}` captures the three consecutive digits that follow.

For example, if your string is “ABC123DEFG456HI789JKL”, this expression will return “789” as it is the third occurrence of a 3-digit consecutive number.

Have you tried it? Let me know how it worked for you!

3 thoughts on “Tableau’s REGEX and ChatGPT

  1. Great approach Mark!
    I feel like I am in the same boat unterstanding the power of REGEX but unsure about the exact syntax…
    ChatGPT solves a lot and for sure provides a great headstart to answer almost any questions.
    I like to add:
    Learning by doing helped my the most – for this I find https://regex101.com/ very helpful.
    You enter a REGEX and one or many input texts. I parses and explains the syntax as well as displays the results.
    To consider: There might be some issues using different variants of REGEX.

    And using REGEX in Tableau I always complete my work with a performance analysis. You never know if the REGEX solution turns out to be heavy at runtime.

  2. Chat GTP is getting better.! This prompt produced an unusable result two weeks ago, but rerunning it I got a RegEx that works perfectly! https://chat.openai.com/share/5e1af283-4eb9-4dd9-91f6-a153e909ac2f

    Take the role of a tableau desktop developer and tell me how to return only the English text in this string “Bust of the Mother of God orans, the medallion of Christ before her. Sigla: Μ̅Θ̅ : Μ(ήτηρ) Θ(εοῦ). Border of dots.”

Leave a Reply

Your email address will not be published. Required fields are marked *