.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;

up vote
6
down vote

favorite

The problem is the following: I have a list intervals which consists of tuples of the form (start, end) [with start <= end]. Each tuple represents an interval (of the real line). We assume that the intervals in intervals are not overlapping each other. Given a new interval (s,e), I would like to write a Python function which checks if (s, e) overlaps any of the intervals in intervals. If (s, e) has a non-empty intersection with at least one of the intervals in intervals, the function should return the indices of these intervals in the list intervals.

Say that the function is called find_intersections. Then, given that intervals = [(1, 3.5), (5.5, 8.7), (10.2, 22.6), (22.7, 23.1)], expected outputs would be:

find_intersection(intervals, (3.2, 5.)) returns array([0])

find_intersection(intervals, (6.1, 7.3)) returns array([1])

find_intersection(intervals, (9.1, 10.2)) returns No intersection.

find_intersection(intervals, (5.8, 22.9)) returns array([1, 2, 3]).

The code for find_intersection I have written is:

import itertools

def find_intersection(intervals, new_interval):
 _times = sorted(list(itertools.chain.from_iterable(intervals)))
 ind = np.searchsorted(_times, np.asarray(new_interval))
 parity = np.mod(ind, 2)
 if (not np.any(parity)) and ind[1] == ind[0]:
 print('No intersection.')
 elif parity[0] == 1:
 ub = ind[1] if parity[1] == 1 else ind[1] - 1
 return np.arange((ind[0] - 1) / 2, (ub - 1) / 2 + 1)
 elif parity[1] == 1:
 lb = ind[0] if parity[0] == 1 else ind[0] + 1
 return np.arange((lb - 1) / 2, (ind[1] - 1) / 2 + 1)
 else:
 lb = ind[0] if parity[0] == 1 else ind[0] + 1
 ub = ind[1] if parity[1] == 1 else ind[1] - 1
 return np.arange((lb - 1) / 2, (ub - 1) / 2 + 1)

I believe that the code does the job.

Is there an easier/smarter way to address this problem?

Edit: this question was cross-posted on stackoverflow.com.

edited 2 days ago

asked 2 days ago

jibounet

1334

New contributor

Hello and Welcome to Code Review Stack Exchange. You have a good question here and it's sad to see that it is deleted. I hope that you will bring it back, I know someone who wants to answer your question.
â€“Â Simon Forsbergâ™¦
2 days ago

@SimonForsberg : Thank you for your interest in my question. I deleted the question as I thought it was more relevant on stackoverflow.com. Now that it was also asked on stackoverflow.com, not deleting the question would result in cross-posting.
â€“Â jibounet
2 days ago

2

@jibounet, This question appears to be On-topic for Code Review, I would leave it be and wait for the answer here. feel free to post a link to the Stack Overflow question here in the comments with a brief note that it has been cross posted.
â€“Â Malachiâ™¦
2 days ago

@jibounet I see that you have received answers on both Stack Overflow and Code Review, as you might see the answer styles is a bit different. Just out of curiosity, which do you like more? (I promise, I won't be offended if you answer Stack Overflow)
â€“Â Simon Forsbergâ™¦
2 days ago

add a commentÂ |Â

up vote
6
down vote

favorite

Say that the function is called find_intersections. Then, given that intervals = [(1, 3.5), (5.5, 8.7), (10.2, 22.6), (22.7, 23.1)], expected outputs would be:

find_intersection(intervals, (3.2, 5.)) returns array([0])

find_intersection(intervals, (6.1, 7.3)) returns array([1])

find_intersection(intervals, (9.1, 10.2)) returns No intersection.

find_intersection(intervals, (5.8, 22.9)) returns array([1, 2, 3]).

The code for find_intersection I have written is:

import itertools

def find_intersection(intervals, new_interval):
 _times = sorted(list(itertools.chain.from_iterable(intervals)))
 ind = np.searchsorted(_times, np.asarray(new_interval))
 parity = np.mod(ind, 2)
 if (not np.any(parity)) and ind[1] == ind[0]:
 print('No intersection.')
 elif parity[0] == 1:
 ub = ind[1] if parity[1] == 1 else ind[1] - 1
 return np.arange((ind[0] - 1) / 2, (ub - 1) / 2 + 1)
 elif parity[1] == 1:
 lb = ind[0] if parity[0] == 1 else ind[0] + 1
 return np.arange((lb - 1) / 2, (ind[1] - 1) / 2 + 1)
 else:
 lb = ind[0] if parity[0] == 1 else ind[0] + 1
 ub = ind[1] if parity[1] == 1 else ind[1] - 1
 return np.arange((lb - 1) / 2, (ub - 1) / 2 + 1)

I believe that the code does the job.

Is there an easier/smarter way to address this problem?

Edit: this question was cross-posted on stackoverflow.com.

edited 2 days ago

asked 2 days ago

jibounet

1334

New contributor

Hello and Welcome to Code Review Stack Exchange. You have a good question here and it's sad to see that it is deleted. I hope that you will bring it back, I know someone who wants to answer your question.
â€“Â Simon Forsbergâ™¦
2 days ago

@SimonForsberg : Thank you for your interest in my question. I deleted the question as I thought it was more relevant on stackoverflow.com. Now that it was also asked on stackoverflow.com, not deleting the question would result in cross-posting.
â€“Â jibounet
2 days ago

2

@jibounet, This question appears to be On-topic for Code Review, I would leave it be and wait for the answer here. feel free to post a link to the Stack Overflow question here in the comments with a brief note that it has been cross posted.
â€“Â Malachiâ™¦
2 days ago

@jibounet I see that you have received answers on both Stack Overflow and Code Review, as you might see the answer styles is a bit different. Just out of curiosity, which do you like more? (I promise, I won't be offended if you answer Stack Overflow)
â€“Â Simon Forsbergâ™¦
2 days ago

add a commentÂ |Â

up vote
6
down vote

favorite

Say that the function is called find_intersections. Then, given that intervals = [(1, 3.5), (5.5, 8.7), (10.2, 22.6), (22.7, 23.1)], expected outputs would be:

find_intersection(intervals, (3.2, 5.)) returns array([0])

find_intersection(intervals, (6.1, 7.3)) returns array([1])

find_intersection(intervals, (9.1, 10.2)) returns No intersection.

find_intersection(intervals, (5.8, 22.9)) returns array([1, 2, 3]).

The code for find_intersection I have written is:

import itertools

def find_intersection(intervals, new_interval):
 _times = sorted(list(itertools.chain.from_iterable(intervals)))
 ind = np.searchsorted(_times, np.asarray(new_interval))
 parity = np.mod(ind, 2)
 if (not np.any(parity)) and ind[1] == ind[0]:
 print('No intersection.')
 elif parity[0] == 1:
 ub = ind[1] if parity[1] == 1 else ind[1] - 1
 return np.arange((ind[0] - 1) / 2, (ub - 1) / 2 + 1)
 elif parity[1] == 1:
 lb = ind[0] if parity[0] == 1 else ind[0] + 1
 return np.arange((lb - 1) / 2, (ind[1] - 1) / 2 + 1)
 else:
 lb = ind[0] if parity[0] == 1 else ind[0] + 1
 ub = ind[1] if parity[1] == 1 else ind[1] - 1
 return np.arange((lb - 1) / 2, (ub - 1) / 2 + 1)

I believe that the code does the job.

Is there an easier/smarter way to address this problem?

Edit: this question was cross-posted on stackoverflow.com.

edited 2 days ago

asked 2 days ago

jibounet

1334

New contributor

Say that the function is called find_intersections. Then, given that intervals = [(1, 3.5), (5.5, 8.7), (10.2, 22.6), (22.7, 23.1)], expected outputs would be:

find_intersection(intervals, (3.2, 5.)) returns array([0])

find_intersection(intervals, (6.1, 7.3)) returns array([1])

find_intersection(intervals, (9.1, 10.2)) returns No intersection.

find_intersection(intervals, (5.8, 22.9)) returns array([1, 2, 3]).

The code for find_intersection I have written is:

import itertools

def find_intersection(intervals, new_interval):
 _times = sorted(list(itertools.chain.from_iterable(intervals)))
 ind = np.searchsorted(_times, np.asarray(new_interval))
 parity = np.mod(ind, 2)
 if (not np.any(parity)) and ind[1] == ind[0]:
 print('No intersection.')
 elif parity[0] == 1:
 ub = ind[1] if parity[1] == 1 else ind[1] - 1
 return np.arange((ind[0] - 1) / 2, (ub - 1) / 2 + 1)
 elif parity[1] == 1:
 lb = ind[0] if parity[0] == 1 else ind[0] + 1
 return np.arange((lb - 1) / 2, (ind[1] - 1) / 2 + 1)
 else:
 lb = ind[0] if parity[0] == 1 else ind[0] + 1
 ub = ind[1] if parity[1] == 1 else ind[1] - 1
 return np.arange((lb - 1) / 2, (ub - 1) / 2 + 1)

I believe that the code does the job.

Is there an easier/smarter way to address this problem?

Edit: this question was cross-posted on stackoverflow.com.

python sorting numpy interval

edited 2 days ago

asked 2 days ago

jibounet

1334

New contributor

edited 2 days ago

asked 2 days ago

jibounet

1334

New contributor

edited 2 days ago

asked 2 days ago

jibounet

1334

New contributor

asked 2 days ago

jibounet

1334

asked 2 days ago

jibounet

1334

New contributor

jibounet is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

Hello and Welcome to Code Review Stack Exchange. You have a good question here and it's sad to see that it is deleted. I hope that you will bring it back, I know someone who wants to answer your question.
â€“Â Simon Forsbergâ™¦
2 days ago

@SimonForsberg : Thank you for your interest in my question. I deleted the question as I thought it was more relevant on stackoverflow.com. Now that it was also asked on stackoverflow.com, not deleting the question would result in cross-posting.
â€“Â jibounet
2 days ago

2

@jibounet, This question appears to be On-topic for Code Review, I would leave it be and wait for the answer here. feel free to post a link to the Stack Overflow question here in the comments with a brief note that it has been cross posted.
â€“Â Malachiâ™¦
2 days ago

@jibounet I see that you have received answers on both Stack Overflow and Code Review, as you might see the answer styles is a bit different. Just out of curiosity, which do you like more? (I promise, I won't be offended if you answer Stack Overflow)
â€“Â Simon Forsbergâ™¦
2 days ago

add a commentÂ |Â

Hello and Welcome to Code Review Stack Exchange. You have a good question here and it's sad to see that it is deleted. I hope that you will bring it back, I know someone who wants to answer your question.
â€“Â Simon Forsbergâ™¦
2 days ago

@SimonForsberg : Thank you for your interest in my question. I deleted the question as I thought it was more relevant on stackoverflow.com. Now that it was also asked on stackoverflow.com, not deleting the question would result in cross-posting.
â€“Â jibounet
2 days ago

2

@jibounet, This question appears to be On-topic for Code Review, I would leave it be and wait for the answer here. feel free to post a link to the Stack Overflow question here in the comments with a brief note that it has been cross posted.
â€“Â Malachiâ™¦
2 days ago

@jibounet I see that you have received answers on both Stack Overflow and Code Review, as you might see the answer styles is a bit different. Just out of curiosity, which do you like more? (I promise, I won't be offended if you answer Stack Overflow)
â€“Â Simon Forsbergâ™¦
2 days ago

Hello and Welcome to Code Review Stack Exchange. You have a good question here and it's sad to see that it is deleted. I hope that you will bring it back, I know someone who wants to answer your question.
â€“Â Simon Forsbergâ™¦
2 days ago

@SimonForsberg : Thank you for your interest in my question. I deleted the question as I thought it was more relevant on stackoverflow.com. Now that it was also asked on stackoverflow.com, not deleting the question would result in cross-posting.
â€“Â jibounet
2 days ago

@jibounet, This question appears to be On-topic for Code Review, I would leave it be and wait for the answer here. feel free to post a link to the Stack Overflow question here in the comments with a brief note that it has been cross posted.
â€“Â Malachiâ™¦
2 days ago

@jibounet I see that you have received answers on both Stack Overflow and Code Review, as you might see the answer styles is a bit different. Just out of curiosity, which do you like more? (I promise, I won't be offended if you answer Stack Overflow)
â€“Â Simon Forsbergâ™¦
2 days ago

add a commentÂ |Â

2 Answers
2

active

oldest

votes

up vote
6
down vote

accepted

1. Review

There is no docstring. What does the function find_intersections do? The text in the post is pretty clear, so you could use this as the basis of a docstring. I would make it explicit that the intervals are open (if that's what you really intend), since in Python intervals are normally half-open.

There seems to be some uncertainty about whether the input intervals is supposed to be already sorted. On the one hand, if it's already sorted then there's no point in calling sorted. But on the other, if it's not sorted then returning a range of indexes is not very useful, as these are indexes into _times, not into intervals.

I recommend picking one possibility or the other: if intervals is supposed to be sorted, then don't sort it (you might assert that it is already sorted); but if not, then translate the returned indexes so that they refer to intervals.

If there is no intersection, find_intersections prints a message. But if the caller is already printing some output, it would be inconvenient (at best) to have this kind of message appear in the middle of the output. Better to leave the printing of messages (or not) up to the caller.

If there is an intersection, find_intersections returns a range of indexes. But if there are no intersection, find_intersections returns None. It is not a good idea to return an exceptional value like this, because it would be easy for the caller to forget to check for the exceptional value, and to write something like this:
```
result = find_intersection(intervals, new_interval)
print(f"Number of intersections = len(result).")
```
which would fail with:
```
TypeError: object of type 'NoneType' has no len()
```
What they should have written is:
```
result = find_intersection(intervals, new_interval)
if result is None:
 print("Number of intersections = 0.")
else:
 print(f"Number of intersections = len(result).")
```
but this is very long-winded. In this case a simple improvement would be to return an empty range of indexes.

If you're going to use NumPy, then you might as well use it wherever possible. So instead of itertools.chain.from_iterable, use numpy.reshape; instead of sorted, use numpy.sort; and so on.

2. Revised code if intervals are unsorted

If intervals is unsorted (as suggested by the call to sorted), then there's no point in sorting it. Sorting costs $O(nlog n)$ whereas comparing the query interval with every element of intervals only costs $O(n)$.

So in this case we can take advantage of the fact that the open interval $(a, b)$ overlaps with the open interval $(c, d)$ if and only if $a < d$ and $c < b$, and write:

def find_intersection(intervals, query):
 """Find intersections between intervals.
 Intervals are open and are represented as pairs (lower bound,
 upper bound).

 Arguments:
 intervals: array_like, shape=(N, 2) -- Array of intervals.
 query: array_like, shape=(2,) -- Interval to query.

 Returns:
 Array of indexes of intervals that overlap with query.

 """
 intervals = np.asarray(intervals)
 lower, upper = query
 return np.argwhere((lower < intervals[:, 1]) & (intervals[:, 0] < upper))

3. Revised code if intervals are sorted

If intervals is already sorted, as suggested by returning indexes into the sorted array, then take advantage of the side argument to numpy.searchsorted and the floor division operator, //:

def find_intersection(intervals, query):
 """Find intersections between intervals.
 Intervals are open and are represented as pairs (lower bound,
 upper bound).

 Arguments:
 intervals: array_like, shape=(N, 2) -- Array of intervals.
 query: array_like, shape=(2,) -- Interval to query.

 Returns:
 Array of indexes of intervals that overlap with query.

 """
 endpoints = np.reshape(intervals, -1)
 lower, upper = query
 i = np.searchsorted(endpoints, lower, side='right')
 j = np.searchsorted(endpoints, upper, side='left')
 return np.arange(i // 2, (j + 1) // 2)

answered 2 days ago

Gareth Rees

41.8k394168

add a commentÂ |Â

up vote
5
down vote

return value

In case there is no overlap, it would be cleaner to either return a sentinel value (None for example), or raise an exception, and don't trigger any side effects, like the print

variable names

ind, lb and ub are not clear variable names. you can use index, upper_bound and lower_bound. In this age of IDE's with code completion, there is no need to abbreviate the variable names like that

algorithm

In your algorithm you flatten intervals. There is no need for this, and you can do it a lot simpler if you let it be 2-dimensional

naive

If you use numpy anyway, then your algorithm can be made a lot simpler.

intervals = np.array(intervals)

Then you can just use comparison to check whether there is overlap

smaller = np.array(intervals[:,0]) <= new_interval[1]
larger = np.array(intervals[:,1]) > new_interval[0]

This doesn't use the fact that intervals is, or can be, sorted, but comparison in numpy is generally rather quick, and probably faster than sorting.

You have an intersection between intervals a and b when a[0] < b[1] and a[1] > b[0]

To get the indices of the overlapping intervals you can use:

intersection = smaller & larger

if not intersection.any():
 return None # or: raise(Exception('No intersection')
return np.argwhere(intersection)


return np.argwhere(smaller & larger)

sorted

If you want to use the fact the intervals is sorted, you can use np.searchsorted, instead of the direct comparison

intervals = np.sort(intervals, axis=0)
lower_index = np.searchsorted(intervals[:,1], new_interval[0])
upper_index = np.searchsorted(intervals[:,0], new_interval[1])

If upper_index > lower_index, there is an overlap

if upper_index <= lower_index:
 return None
return intervals[lower_index: upper_index]

if you need the indices, you can do return lower_index, upper_index or return np.arange(lower_index, upper_index). NB, this is on the sorted intervals

answered 2 days ago

Maarten FabrÃ©

3,504214

add a commentÂ |Â

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");

StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: false,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

jibounet is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f203468%2ffind-the-intervals-which-have-a-non-empty-intersection-with-a-given-interval%23new-answer', 'question_page');

);

Post as a guest

Name

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

up vote
6
down vote

accepted

1. Review

There is no docstring. What does the function find_intersections do? The text in the post is pretty clear, so you could use this as the basis of a docstring. I would make it explicit that the intervals are open (if that's what you really intend), since in Python intervals are normally half-open.

There seems to be some uncertainty about whether the input intervals is supposed to be already sorted. On the one hand, if it's already sorted then there's no point in calling sorted. But on the other, if it's not sorted then returning a range of indexes is not very useful, as these are indexes into _times, not into intervals.

I recommend picking one possibility or the other: if intervals is supposed to be sorted, then don't sort it (you might assert that it is already sorted); but if not, then translate the returned indexes so that they refer to intervals.

If there is no intersection, find_intersections prints a message. But if the caller is already printing some output, it would be inconvenient (at best) to have this kind of message appear in the middle of the output. Better to leave the printing of messages (or not) up to the caller.

If there is an intersection, find_intersections returns a range of indexes. But if there are no intersection, find_intersections returns None. It is not a good idea to return an exceptional value like this, because it would be easy for the caller to forget to check for the exceptional value, and to write something like this:
```
result = find_intersection(intervals, new_interval)
print(f"Number of intersections = len(result).")
```
which would fail with:
```
TypeError: object of type 'NoneType' has no len()
```
What they should have written is:
```
result = find_intersection(intervals, new_interval)
if result is None:
 print("Number of intersections = 0.")
else:
 print(f"Number of intersections = len(result).")
```
but this is very long-winded. In this case a simple improvement would be to return an empty range of indexes.

If you're going to use NumPy, then you might as well use it wherever possible. So instead of itertools.chain.from_iterable, use numpy.reshape; instead of sorted, use numpy.sort; and so on.

2. Revised code if intervals are unsorted

So in this case we can take advantage of the fact that the open interval $(a, b)$ overlaps with the open interval $(c, d)$ if and only if $a < d$ and $c < b$, and write:

def find_intersection(intervals, query):
 """Find intersections between intervals.
 Intervals are open and are represented as pairs (lower bound,
 upper bound).

 Arguments:
 intervals: array_like, shape=(N, 2) -- Array of intervals.
 query: array_like, shape=(2,) -- Interval to query.

 Returns:
 Array of indexes of intervals that overlap with query.

 """
 intervals = np.asarray(intervals)
 lower, upper = query
 return np.argwhere((lower < intervals[:, 1]) & (intervals[:, 0] < upper))

3. Revised code if intervals are sorted

If intervals is already sorted, as suggested by returning indexes into the sorted array, then take advantage of the side argument to numpy.searchsorted and the floor division operator, //:

def find_intersection(intervals, query):
 """Find intersections between intervals.
 Intervals are open and are represented as pairs (lower bound,
 upper bound).

 Arguments:
 intervals: array_like, shape=(N, 2) -- Array of intervals.
 query: array_like, shape=(2,) -- Interval to query.

 Returns:
 Array of indexes of intervals that overlap with query.

 """
 endpoints = np.reshape(intervals, -1)
 lower, upper = query
 i = np.searchsorted(endpoints, lower, side='right')
 j = np.searchsorted(endpoints, upper, side='left')
 return np.arange(i // 2, (j + 1) // 2)

answered 2 days ago

Gareth Rees

41.8k394168

add a commentÂ |Â

up vote
6
down vote

accepted

1. Review

There is no docstring. What does the function find_intersections do? The text in the post is pretty clear, so you could use this as the basis of a docstring. I would make it explicit that the intervals are open (if that's what you really intend), since in Python intervals are normally half-open.

There seems to be some uncertainty about whether the input intervals is supposed to be already sorted. On the one hand, if it's already sorted then there's no point in calling sorted. But on the other, if it's not sorted then returning a range of indexes is not very useful, as these are indexes into _times, not into intervals.

I recommend picking one possibility or the other: if intervals is supposed to be sorted, then don't sort it (you might assert that it is already sorted); but if not, then translate the returned indexes so that they refer to intervals.

If there is no intersection, find_intersections prints a message. But if the caller is already printing some output, it would be inconvenient (at best) to have this kind of message appear in the middle of the output. Better to leave the printing of messages (or not) up to the caller.

If there is an intersection, find_intersections returns a range of indexes. But if there are no intersection, find_intersections returns None. It is not a good idea to return an exceptional value like this, because it would be easy for the caller to forget to check for the exceptional value, and to write something like this:
```
result = find_intersection(intervals, new_interval)
print(f"Number of intersections = len(result).")
```
which would fail with:
```
TypeError: object of type 'NoneType' has no len()
```
What they should have written is:
```
result = find_intersection(intervals, new_interval)
if result is None:
 print("Number of intersections = 0.")
else:
 print(f"Number of intersections = len(result).")
```
but this is very long-winded. In this case a simple improvement would be to return an empty range of indexes.

If you're going to use NumPy, then you might as well use it wherever possible. So instead of itertools.chain.from_iterable, use numpy.reshape; instead of sorted, use numpy.sort; and so on.

2. Revised code if intervals are unsorted

So in this case we can take advantage of the fact that the open interval $(a, b)$ overlaps with the open interval $(c, d)$ if and only if $a < d$ and $c < b$, and write:

def find_intersection(intervals, query):
 """Find intersections between intervals.
 Intervals are open and are represented as pairs (lower bound,
 upper bound).

 Arguments:
 intervals: array_like, shape=(N, 2) -- Array of intervals.
 query: array_like, shape=(2,) -- Interval to query.

 Returns:
 Array of indexes of intervals that overlap with query.

 """
 intervals = np.asarray(intervals)
 lower, upper = query
 return np.argwhere((lower < intervals[:, 1]) & (intervals[:, 0] < upper))

3. Revised code if intervals are sorted

If intervals is already sorted, as suggested by returning indexes into the sorted array, then take advantage of the side argument to numpy.searchsorted and the floor division operator, //:

def find_intersection(intervals, query):
 """Find intersections between intervals.
 Intervals are open and are represented as pairs (lower bound,
 upper bound).

 Arguments:
 intervals: array_like, shape=(N, 2) -- Array of intervals.
 query: array_like, shape=(2,) -- Interval to query.

 Returns:
 Array of indexes of intervals that overlap with query.

 """
 endpoints = np.reshape(intervals, -1)
 lower, upper = query
 i = np.searchsorted(endpoints, lower, side='right')
 j = np.searchsorted(endpoints, upper, side='left')
 return np.arange(i // 2, (j + 1) // 2)

answered 2 days ago

Gareth Rees

41.8k394168

add a commentÂ |Â

up vote
6
down vote

accepted

1. Review

There is no docstring. What does the function find_intersections do? The text in the post is pretty clear, so you could use this as the basis of a docstring. I would make it explicit that the intervals are open (if that's what you really intend), since in Python intervals are normally half-open.

There seems to be some uncertainty about whether the input intervals is supposed to be already sorted. On the one hand, if it's already sorted then there's no point in calling sorted. But on the other, if it's not sorted then returning a range of indexes is not very useful, as these are indexes into _times, not into intervals.

I recommend picking one possibility or the other: if intervals is supposed to be sorted, then don't sort it (you might assert that it is already sorted); but if not, then translate the returned indexes so that they refer to intervals.

If there is no intersection, find_intersections prints a message. But if the caller is already printing some output, it would be inconvenient (at best) to have this kind of message appear in the middle of the output. Better to leave the printing of messages (or not) up to the caller.

If there is an intersection, find_intersections returns a range of indexes. But if there are no intersection, find_intersections returns None. It is not a good idea to return an exceptional value like this, because it would be easy for the caller to forget to check for the exceptional value, and to write something like this:
```
result = find_intersection(intervals, new_interval)
print(f"Number of intersections = len(result).")
```
which would fail with:
```
TypeError: object of type 'NoneType' has no len()
```
What they should have written is:
```
result = find_intersection(intervals, new_interval)
if result is None:
 print("Number of intersections = 0.")
else:
 print(f"Number of intersections = len(result).")
```
but this is very long-winded. In this case a simple improvement would be to return an empty range of indexes.

If you're going to use NumPy, then you might as well use it wherever possible. So instead of itertools.chain.from_iterable, use numpy.reshape; instead of sorted, use numpy.sort; and so on.

2. Revised code if intervals are unsorted

So in this case we can take advantage of the fact that the open interval $(a, b)$ overlaps with the open interval $(c, d)$ if and only if $a < d$ and $c < b$, and write:

def find_intersection(intervals, query):
 """Find intersections between intervals.
 Intervals are open and are represented as pairs (lower bound,
 upper bound).

 Arguments:
 intervals: array_like, shape=(N, 2) -- Array of intervals.
 query: array_like, shape=(2,) -- Interval to query.

 Returns:
 Array of indexes of intervals that overlap with query.

 """
 intervals = np.asarray(intervals)
 lower, upper = query
 return np.argwhere((lower < intervals[:, 1]) & (intervals[:, 0] < upper))

3. Revised code if intervals are sorted

If intervals is already sorted, as suggested by returning indexes into the sorted array, then take advantage of the side argument to numpy.searchsorted and the floor division operator, //:

def find_intersection(intervals, query):
 """Find intersections between intervals.
 Intervals are open and are represented as pairs (lower bound,
 upper bound).

 Arguments:
 intervals: array_like, shape=(N, 2) -- Array of intervals.
 query: array_like, shape=(2,) -- Interval to query.

 Returns:
 Array of indexes of intervals that overlap with query.

 """
 endpoints = np.reshape(intervals, -1)
 lower, upper = query
 i = np.searchsorted(endpoints, lower, side='right')
 j = np.searchsorted(endpoints, upper, side='left')
 return np.arange(i // 2, (j + 1) // 2)

answered 2 days ago

Gareth Rees

41.8k394168

1. Review

There is no docstring. What does the function find_intersections do? The text in the post is pretty clear, so you could use this as the basis of a docstring. I would make it explicit that the intervals are open (if that's what you really intend), since in Python intervals are normally half-open.

There seems to be some uncertainty about whether the input intervals is supposed to be already sorted. On the one hand, if it's already sorted then there's no point in calling sorted. But on the other, if it's not sorted then returning a range of indexes is not very useful, as these are indexes into _times, not into intervals.

I recommend picking one possibility or the other: if intervals is supposed to be sorted, then don't sort it (you might assert that it is already sorted); but if not, then translate the returned indexes so that they refer to intervals.

If there is no intersection, find_intersections prints a message. But if the caller is already printing some output, it would be inconvenient (at best) to have this kind of message appear in the middle of the output. Better to leave the printing of messages (or not) up to the caller.

If there is an intersection, find_intersections returns a range of indexes. But if there are no intersection, find_intersections returns None. It is not a good idea to return an exceptional value like this, because it would be easy for the caller to forget to check for the exceptional value, and to write something like this:
```
result = find_intersection(intervals, new_interval)
print(f"Number of intersections = len(result).")
```
which would fail with:
```
TypeError: object of type 'NoneType' has no len()
```
What they should have written is:
```
result = find_intersection(intervals, new_interval)
if result is None:
 print("Number of intersections = 0.")
else:
 print(f"Number of intersections = len(result).")
```
but this is very long-winded. In this case a simple improvement would be to return an empty range of indexes.

If you're going to use NumPy, then you might as well use it wherever possible. So instead of itertools.chain.from_iterable, use numpy.reshape; instead of sorted, use numpy.sort; and so on.

2. Revised code if intervals are unsorted

So in this case we can take advantage of the fact that the open interval $(a, b)$ overlaps with the open interval $(c, d)$ if and only if $a < d$ and $c < b$, and write:

def find_intersection(intervals, query):
 """Find intersections between intervals.
 Intervals are open and are represented as pairs (lower bound,
 upper bound).

 Arguments:
 intervals: array_like, shape=(N, 2) -- Array of intervals.
 query: array_like, shape=(2,) -- Interval to query.

 Returns:
 Array of indexes of intervals that overlap with query.

 """
 intervals = np.asarray(intervals)
 lower, upper = query
 return np.argwhere((lower < intervals[:, 1]) & (intervals[:, 0] < upper))

3. Revised code if intervals are sorted

If intervals is already sorted, as suggested by returning indexes into the sorted array, then take advantage of the side argument to numpy.searchsorted and the floor division operator, //:

def find_intersection(intervals, query):
 """Find intersections between intervals.
 Intervals are open and are represented as pairs (lower bound,
 upper bound).

 Arguments:
 intervals: array_like, shape=(N, 2) -- Array of intervals.
 query: array_like, shape=(2,) -- Interval to query.

 Returns:
 Array of indexes of intervals that overlap with query.

 """
 endpoints = np.reshape(intervals, -1)
 lower, upper = query
 i = np.searchsorted(endpoints, lower, side='right')
 j = np.searchsorted(endpoints, upper, side='left')
 return np.arange(i // 2, (j + 1) // 2)

answered 2 days ago

Gareth Rees

41.8k394168

answered 2 days ago

Gareth Rees

41.8k394168

answered 2 days ago

Gareth Rees

41.8k394168

answered 2 days ago

Gareth Rees

41.8k394168

add a commentÂ |Â

up vote
5
down vote

return value

In case there is no overlap, it would be cleaner to either return a sentinel value (None for example), or raise an exception, and don't trigger any side effects, like the print

variable names

algorithm

In your algorithm you flatten intervals. There is no need for this, and you can do it a lot simpler if you let it be 2-dimensional

naive

If you use numpy anyway, then your algorithm can be made a lot simpler.

intervals = np.array(intervals)

Then you can just use comparison to check whether there is overlap

smaller = np.array(intervals[:,0]) <= new_interval[1]
larger = np.array(intervals[:,1]) > new_interval[0]

This doesn't use the fact that intervals is, or can be, sorted, but comparison in numpy is generally rather quick, and probably faster than sorting.

You have an intersection between intervals a and b when a[0] < b[1] and a[1] > b[0]

To get the indices of the overlapping intervals you can use:

intersection = smaller & larger

if not intersection.any():
 return None # or: raise(Exception('No intersection')
return np.argwhere(intersection)


return np.argwhere(smaller & larger)

sorted

If you want to use the fact the intervals is sorted, you can use np.searchsorted, instead of the direct comparison

intervals = np.sort(intervals, axis=0)
lower_index = np.searchsorted(intervals[:,1], new_interval[0])
upper_index = np.searchsorted(intervals[:,0], new_interval[1])

If upper_index > lower_index, there is an overlap

if upper_index <= lower_index:
 return None
return intervals[lower_index: upper_index]

if you need the indices, you can do return lower_index, upper_index or return np.arange(lower_index, upper_index). NB, this is on the sorted intervals

answered 2 days ago

Maarten FabrÃ©

3,504214

add a commentÂ |Â

up vote
5
down vote

return value

In case there is no overlap, it would be cleaner to either return a sentinel value (None for example), or raise an exception, and don't trigger any side effects, like the print

variable names

algorithm

In your algorithm you flatten intervals. There is no need for this, and you can do it a lot simpler if you let it be 2-dimensional

naive

If you use numpy anyway, then your algorithm can be made a lot simpler.

intervals = np.array(intervals)

Then you can just use comparison to check whether there is overlap

smaller = np.array(intervals[:,0]) <= new_interval[1]
larger = np.array(intervals[:,1]) > new_interval[0]

This doesn't use the fact that intervals is, or can be, sorted, but comparison in numpy is generally rather quick, and probably faster than sorting.

You have an intersection between intervals a and b when a[0] < b[1] and a[1] > b[0]

To get the indices of the overlapping intervals you can use:

intersection = smaller & larger

if not intersection.any():
 return None # or: raise(Exception('No intersection')
return np.argwhere(intersection)


return np.argwhere(smaller & larger)

sorted

If you want to use the fact the intervals is sorted, you can use np.searchsorted, instead of the direct comparison

intervals = np.sort(intervals, axis=0)
lower_index = np.searchsorted(intervals[:,1], new_interval[0])
upper_index = np.searchsorted(intervals[:,0], new_interval[1])

If upper_index > lower_index, there is an overlap

if upper_index <= lower_index:
 return None
return intervals[lower_index: upper_index]

if you need the indices, you can do return lower_index, upper_index or return np.arange(lower_index, upper_index). NB, this is on the sorted intervals

answered 2 days ago

Maarten FabrÃ©

3,504214

add a commentÂ |Â

up vote
5
down vote

return value

In case there is no overlap, it would be cleaner to either return a sentinel value (None for example), or raise an exception, and don't trigger any side effects, like the print

variable names

algorithm

In your algorithm you flatten intervals. There is no need for this, and you can do it a lot simpler if you let it be 2-dimensional

naive

If you use numpy anyway, then your algorithm can be made a lot simpler.

intervals = np.array(intervals)

Then you can just use comparison to check whether there is overlap

smaller = np.array(intervals[:,0]) <= new_interval[1]
larger = np.array(intervals[:,1]) > new_interval[0]

This doesn't use the fact that intervals is, or can be, sorted, but comparison in numpy is generally rather quick, and probably faster than sorting.

You have an intersection between intervals a and b when a[0] < b[1] and a[1] > b[0]

To get the indices of the overlapping intervals you can use:

intersection = smaller & larger

if not intersection.any():
 return None # or: raise(Exception('No intersection')
return np.argwhere(intersection)


return np.argwhere(smaller & larger)

sorted

If you want to use the fact the intervals is sorted, you can use np.searchsorted, instead of the direct comparison

intervals = np.sort(intervals, axis=0)
lower_index = np.searchsorted(intervals[:,1], new_interval[0])
upper_index = np.searchsorted(intervals[:,0], new_interval[1])

If upper_index > lower_index, there is an overlap

if upper_index <= lower_index:
 return None
return intervals[lower_index: upper_index]

if you need the indices, you can do return lower_index, upper_index or return np.arange(lower_index, upper_index). NB, this is on the sorted intervals

answered 2 days ago

Maarten FabrÃ©

3,504214

return value

In case there is no overlap, it would be cleaner to either return a sentinel value (None for example), or raise an exception, and don't trigger any side effects, like the print

variable names

algorithm

In your algorithm you flatten intervals. There is no need for this, and you can do it a lot simpler if you let it be 2-dimensional

naive

If you use numpy anyway, then your algorithm can be made a lot simpler.

intervals = np.array(intervals)

Then you can just use comparison to check whether there is overlap

smaller = np.array(intervals[:,0]) <= new_interval[1]
larger = np.array(intervals[:,1]) > new_interval[0]

This doesn't use the fact that intervals is, or can be, sorted, but comparison in numpy is generally rather quick, and probably faster than sorting.

You have an intersection between intervals a and b when a[0] < b[1] and a[1] > b[0]

To get the indices of the overlapping intervals you can use:

intersection = smaller & larger

if not intersection.any():
 return None # or: raise(Exception('No intersection')
return np.argwhere(intersection)


return np.argwhere(smaller & larger)

sorted

If you want to use the fact the intervals is sorted, you can use np.searchsorted, instead of the direct comparison

intervals = np.sort(intervals, axis=0)
lower_index = np.searchsorted(intervals[:,1], new_interval[0])
upper_index = np.searchsorted(intervals[:,0], new_interval[1])

If upper_index > lower_index, there is an overlap

if upper_index <= lower_index:
 return None
return intervals[lower_index: upper_index]

if you need the indices, you can do return lower_index, upper_index or return np.arange(lower_index, upper_index). NB, this is on the sorted intervals

answered 2 days ago

Maarten FabrÃ©

3,504214

answered 2 days ago

Maarten FabrÃ©

3,504214

answered 2 days ago

Maarten FabrÃ©

3,504214

answered 2 days ago

Maarten FabrÃ©

3,504214

add a commentÂ |Â

jibounet is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

jibounet is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

Post as a guest

Name

Category

Random preview

Find the intervals which have a non-empty intersection with a given interval

2 Answers 2

1. Review

2. Revised code if intervals are unsorted

3. Revised code if intervals are sorted

return value

variable names

algorithm

naive

sorted

Your Answer

Sign up or log in

Post as a guest

Post as a guest

2 Answers 2

2 Answers 2

1. Review

2. Revised code if intervals are unsorted

3. Revised code if intervals are sorted

1. Review

2. Revised code if intervals are unsorted

3. Revised code if intervals are sorted

1. Review

2. Revised code if intervals are unsorted

3. Revised code if intervals are sorted

1. Review

2. Revised code if intervals are unsorted

3. Revised code if intervals are sorted

return value

variable names

algorithm

naive

sorted

return value

variable names

algorithm

naive

sorted

return value

variable names

algorithm

naive

sorted

return value

variable names

algorithm

naive

sorted

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Comments

Post a Comment

Popular posts from this blog

What does second last employer means? [closed]

List of Gilmore Girls characters

Confectionery

2 Answers
2

2 Answers
2

2 Answers
2