Amazon Coding Question – Solved

Given a string, how many different substrings exist in it that have no repeating characters? Two substrings are considered different if they have a different start or end index.

Example
s = "abac"

The substrings that have no repeating characters in them are "a", "b", "a", "c", "ab", "ba", "ac", and "bac". Note that "aba" and "abac" do not qualify because the character 'a' is repeated in them. Also, note that two substrings, "a" and "a", both qualify because their start indices are different: s[0] and s[2]. There are 8 substrings that have no repeating characters.

Function Description
Complete the function findSubstrings in the editor below.

findSubstrings has the following parameter:
    string s: the given string

Returns
    int: the number of substrings in s that have no repeating characters

Constraints
1 ≤ length of s ≤ 10⁵
s consists of only lowercase English letters, ascii['a'-'z'].

Input Format For Custom Testing

Sample Case 0
Sample Input For Custom Testing
bcada
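
The example count can be checked by direct enumeration. The following is a minimal brute-force sketch, intended only as a reference for small inputs; the helper name find_substrings_brute_force is hypothetical and not part of the original problem.

```python
def find_substrings_brute_force(s: str) -> list:
    """Return every substring of s that has no repeated character.

    Hypothetical reference helper: it enumerates substrings by start index
    and stops growing from a start as soon as a character repeats.
    """
    found = []
    for start in range(len(s)):
        seen = set()
        for end in range(start, len(s)):
            if s[end] in seen:
                break  # every longer substring from this start repeats too
            seen.add(s[end])
            found.append(s[start:end + 1])
    return found


qualifying = find_substrings_brute_force("abac")
print(qualifying)       # ['a', 'ab', 'b', 'ba', 'bac', 'a', 'ac', 'c']
print(len(qualifying))  # 8
```

The two "a" entries come from start indices 0 and 2, which is why both are counted.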

Asked in: Amazon

Solution


#!/bin/python3

import math
import os
# ... (the rest of the solution is not included in the source)


Explanation


```
To approach this problem, start by carefully understanding what is being asked. You are given a string and asked to count how many substrings (not subsequences) exist such that there are no repeating characters within each substring. A substring is a contiguous sequence of characters, and two substrings are considered different if they start or end at different indices, even if the characters are the same.

Now, begin by exploring the brute-force strategy conceptually, just to get an idea of what’s involved. Imagine iterating over every possible starting index of the string, and for each start, you grow the substring one character at a time. You check whether the current substring has repeated characters. If it does, you stop growing that substring from that starting point and move to the next starting point. This would give you all substrings with unique characters. On a general alphabet this costs O(n^2) time in the worst case, which is too slow for a string of size up to 100,000. (With only 26 lowercase letters, the early stop caps each start at 26 characters, but the sliding window described below removes even that factor and does not depend on the alphabet size.)

Given that, the next step is to think about how to optimize this. You want a method that avoids redundant work and allows you to examine substrings efficiently without checking each one individually. This is where you can think in terms of sliding window techniques.

Sliding window approaches are powerful for problems where you're dealing with substrings or intervals, especially when trying to maintain a certain property (in this case, uniqueness of characters). Imagine maintaining a window [left, right] that contains only unique characters. You can use a data structure like a set to keep track of the characters currently in the window.

You begin with both pointers at the start of the string. As you move the right pointer one step at a time, you check if the character at right is already in your set. If it’s not, then the window is valid, and you can calculate how many new substrings this extended window contributes. Specifically, every time you add a character to the window and maintain uniqueness, the number of substrings ending at the current right and starting anywhere from left to right is right - left + 1. You add this to your total count.

However, if the character at right is already in the set, adding it would make the window invalid. To fix this, you move the left pointer forward, removing s[left] from the set at each step, until the duplicate is gone. Then you add the new character and count right - left + 1 as before. You repeat this process until you’ve scanned the entire string.

The key is recognizing that each character is added to and removed from the sliding window at most once, so the total work is O(n) in the length of the string.

Conceptually, every time you expand the window to the right and it is still valid, you discover as many new substrings as the current window length: one for each start index from left to right, all ending at right. None of them contains a repeated character, because each lies inside a window of distinct characters, and none is counted twice, because each has a different (start, end) pair. Each time you find a duplicate, you just slide the left pointer until your window is valid again.

To structure your thought process further:
- Understand what makes a substring valid (no repeating characters).
- Use two pointers to define a window over the string.
- Maintain a data structure (like a set) that tracks characters in the current window.
- Expand the right pointer to explore new characters.
- When a duplicate is encountered, shift the left pointer to eliminate the duplicate.
- For every step where the window is valid, count the number of valid substrings ending at that index.

In summary, you’re looking to reduce the number of operations by avoiding a brute-force enumeration of all substrings. By using a dynamic window that moves only when necessary and maintaining an efficient structure to track character presence, you can ensure you process the string in linear time. Keep your focus on counting how many new valid substrings are formed as you move the right pointer, and adjust the left pointer only when needed to maintain the constraint of no repeating characters.
```
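
The sliding-window steps described above can be sketched as follows. This is one possible implementation under the stated constraints, not necessarily the site's own solution; the function name findSubstrings comes from the problem statement, while window, left, right, and total are illustrative names.

```python
def findSubstrings(s: str) -> int:
    """Count substrings of s with no repeated character (sliding-window sketch)."""
    window = set()  # characters currently inside s[left..right]
    left = 0
    total = 0
    for right, ch in enumerate(s):
        # Shrink from the left until ch can be added without a duplicate.
        while ch in window:
            window.remove(s[left])
            left += 1
        window.add(ch)
        # Every start index in [left, right] gives a valid substring ending at right.
        total += right - left + 1
    return total


print(findSubstrings("abac"))  # 8
```

For "abac" the per-step contributions are 1, 2, 2, and 3, which sum to the 8 substrings listed in the problem statement. Each index enters and leaves the window at most once, so the loop does O(n) work overall.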

