Regex to Match Number of Subdirectories in a URL

Comments:9

  1. Hi, thanks for writing out this article. It’s been super helpful and is almost exactly what I’m looking for.

    However, I’d like to take “Variation 1” one step further.

    With the example, Regex for exactly one sub-directory:
    ^/[^/]+/$

    This will match any top-level directory, for example: /retail/.

    I’d like to do this, plus 2 and 3 directories deeper, but ideally I want to specify exactly what that directory is. In this example, matching only the /retail/ directory, and all subsequent subdirectories.

    Would love if you’d be able to explain that!

    1. That’s great, glad to hear it was useful!

      If you want to target a specific directory, you should be able to type it in like this:

      ^/(retail)+/[^/]+/$ (2 subdirectories, including /retail/)
      ^/(retail)+/[^/]+/[^/]+/$ (3 subdirectories, including /retail/)
      ^/(retail)+/[^/]+/[^/]+/[^/]+/$ (4 subdirectories, including /retail/)

      Please try that and let me know if there’s any issue.

      1. Hi there! Thanks so much for this article. It’s incredibly helpful. I’m trying the above without luck. In my example there is an underscore in the first subdirectory I’d like to group the content by.
        Example:
        Goal: group all content containing two sub directories after /retail_store/, beginning with /retail_store/
        ^/(retail_store)+/[^/]+/$

        Any idea what I might be doing wrong?

      2. Thank you for your comment, much appreciated!

        If you have 2 subdirectories after /retail_store/, you’ll have a total of 3 subdirectories. So for that case you should use the following:
        ^/(retail_store)+/[^/]+/[^/]+/$

        It shouldn’t matter if there’s an underscore or not. Please try it out and let me know how it goes!
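The patterns in this thread all follow one rule: a fixed first directory, then one `[^/]+/` segment for each additional level. Below is a minimal Python sketch of that rule, assuming the regex is applied to a URL path that ends with a trailing slash. The helper name `depth_pattern` is hypothetical and not part of the article. The sketch also leaves out the `+` after the group, because that quantifier would accept a repeated name such as `/retailretail/` as well.

```python
import re


def depth_pattern(first_dir: str, total_levels: int) -> re.Pattern:
    """Match paths that start with /first_dir/ and have exactly total_levels directories."""
    if total_levels < 1:
        raise ValueError("total_levels must be at least 1")
    # One "[^/]+/" segment per additional level after the fixed first directory.
    rest = "[^/]+/" * (total_levels - 1)
    return re.compile(rf"^/{re.escape(first_dir)}/{rest}$")


# /retail_store/ plus two more levels is three directories in total.
three_levels = depth_pattern("retail_store", 3)
print(bool(three_levels.search("/retail_store/123/456/")))  # True
print(bool(three_levels.search("/retail_store/123/")))      # False
```

Calling `depth_pattern("retail", 2)` produces a pattern equivalent to `^/retail/[^/]+/$`.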

  2. Thanks so much, Ana! I’m still having some issues with this.

    For context, I’m trying to group content in GA using this regex. I want to group views to two different sub-directories:
    1. https://website.com/retail_store/123
    2. http://website.com/retail_store/123/456
    Pages with exactly two subdirectories starting with retail_store are one category, and pages with exactly three subdirectories starting with retail_store are the second category. Any other advice you might have would be awesome – I so appreciate your help!

      1. Hey there! Just realized I never received the notification for this. Your suggestion works! I’ll do some reading up around trailing slashes to make sure I don’t run into this issue again in the future. 🙂
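The suggestion mentioned in the last reply is not shown in the thread, but the example URLs above have no trailing slash, while the earlier patterns require one. One possible way to handle both forms is to make the final slash optional with `/?`. The Python sketch below assumes the regex is matched against the path only (for example `/retail_store/123`), and the names `TWO_LEVELS`, `THREE_LEVELS`, and `categorize` are hypothetical.

```python
import re

# Assumption: the regex sees only the path, not the scheme or host.
# "/?" makes the trailing slash optional.
TWO_LEVELS = re.compile(r"^/retail_store/[^/]+/?$")
THREE_LEVELS = re.compile(r"^/retail_store/[^/]+/[^/]+/?$")


def categorize(path: str) -> str:
    """Return the content group for a path under /retail_store/."""
    if TWO_LEVELS.search(path):
        return "two levels"
    if THREE_LEVELS.search(path):
        return "three levels"
    return "other"


print(categorize("/retail_store/123"))       # two levels
print(categorize("/retail_store/123/456"))   # three levels
print(categorize("/retail_store/123/456/"))  # three levels
```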

  3. Super helpful post! Do you also have something in mind for filtering a Landing Page URL based on the number of certain symbols? E.g. filter out URLs that contain 3 or more “_”?

    1. I think you can use the following regex to capture URLs with 3 or more underscores:
      \_.*\_.*\_
      Please check it out and let me know if that works for you!
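As a quick check of the underscore pattern, here is a small Python sketch. The underscore has no special meaning in regular expressions, so the sketch writes it unescaped. The function name `has_three_or_more_underscores` is hypothetical, and the plain `count` comparison is included only as a cross-check.

```python
import re

# Three underscores with any characters (or none) between them.
THREE_PLUS_UNDERSCORES = re.compile(r"_.*_.*_")


def has_three_or_more_underscores(url: str) -> bool:
    """True when the URL contains at least three underscores."""
    return THREE_PLUS_UNDERSCORES.search(url) is not None


for url in ["/summer_sale_2023_shoes/", "/summer_sale/"]:
    # The regex result should agree with a simple character count.
    print(url, has_three_or_more_underscores(url), url.count("_") >= 3)
```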
