make_moons_check_data: Synthetic Two Moons Dataset

The `make_moons_check_data` function generates a synthetic "two moons" dataset. This classic dataset consists of two interleaving half-circles and is widely used for visualizing and benchmarking clustering and classification algorithms, particularly those that must learn non-linear decision boundaries.

Overview

The dataset forms two crescent shapes that are not linearly separable.

Class 0 (Upper Moon): A half-circle arching upwards.
Class 1 (Lower Moon): A half-circle arching downwards, shifted and interlocked with the upper moon.
Purpose: Ideal for testing kernel methods, neural networks, or advanced clustering algorithms (like GnosticLocalClustering) that can handle non-convex shapes.
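
The geometry described above can be sketched in a few lines. This is only an illustration, not the library's implementation: the helper name `two_moons_sketch`, the even split between classes, and the particular shift of the lower moon (right by 1, up by 0.5) are assumptions chosen to produce two interlocking arcs. It uses only the standard library and returns plain lists, whereas `make_moons_check_data` returns NumPy arrays.

```python
import math
import random


def two_moons_sketch(n_samples=30, noise=None, seed=42):
    """Illustrative two-moons generator (hypothetical, not the library's code)."""
    rng = random.Random(seed)
    n_upper = n_samples // 2          # assumed: classes are split evenly
    n_lower = n_samples - n_upper

    points, labels = [], []
    # Class 0: upper half-circle of radius 1 centred at the origin.
    for i in range(n_upper):
        t = math.pi * i / max(n_upper - 1, 1)
        points.append([math.cos(t), math.sin(t)])
        labels.append(0)
    # Class 1: lower half-circle, shifted right by 1 and up by 0.5
    # so that the two arcs interlock (assumed offsets).
    for i in range(n_lower):
        t = math.pi * i / max(n_lower - 1, 1)
        points.append([1.0 - math.cos(t), 0.5 - math.sin(t)])
        labels.append(1)

    # Optional isotropic Gaussian noise on both coordinates.
    if noise is not None:
        for p in points:
            p[0] += rng.gauss(0.0, noise)
            p[1] += rng.gauss(0.0, noise)
    return points, labels


X, y = two_moons_sketch(n_samples=100, noise=0.1)
print(len(X), len(X[0]), sorted(set(y)))
```

Because each arc curves around the tip of the other, no single straight line separates the two label groups, which is what makes the dataset a useful test for non-linear methods.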

Parameters

| Parameter | Type | Description | Default |
|---|---|---|---|
| `n_samples` | int | Total number of data points to generate. | `30` |
| `noise` | float or None | Standard deviation of the Gaussian noise added to the data. If `None`, no noise is added. | `None` |
| `seed` | int | Random seed for reproducibility. | `42` |
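
To make the interplay of `noise` and `seed` concrete, here is a minimal standard-library sketch. The helper `add_gaussian_noise` is hypothetical and only mirrors the documented behaviour (a fixed seed gives repeatable noise, and `None` leaves the points untouched); how the library draws its random numbers internally is not stated in this page.

```python
import random


def add_gaussian_noise(points, noise=None, seed=42):
    """Hypothetical helper: jitter 2-D points with seeded Gaussian noise."""
    if noise is None:
        # No noise requested: return an unmodified copy.
        return [list(p) for p in points]
    rng = random.Random(seed)
    return [[x + rng.gauss(0.0, noise), y + rng.gauss(0.0, noise)]
            for x, y in points]


clean = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

same_a = add_gaussian_noise(clean, noise=0.1, seed=42)
same_b = add_gaussian_noise(clean, noise=0.1, seed=42)
other = add_gaussian_noise(clean, noise=0.1, seed=7)

print(same_a == same_b)                     # identical seed, identical noise
print(same_a == other)                      # different seed, different noise
print(add_gaussian_noise(clean) == clean)   # noise=None leaves data unchanged
```

Fixing the seed is what allows a benchmark or a plot to be regenerated exactly, while changing it yields a fresh noisy realization of the same underlying shape.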

Returns

| Return | Type | Description |
|---|---|---|
| `X` | numpy.ndarray | Input feature array of shape `(n_samples, 2)`. |
| `y` | numpy.ndarray | Target label array of shape `(n_samples,)`. |

Example Usage

from machinegnostics.datasets import make_moons_check_data
import numpy as np

# Generate noisy moon data
X, y = make_moons_check_data(n_samples=100, noise=0.1)

print(f"X shape: {X.shape}")
print(f"Unique classes: {np.unique(y)}")
# Output:
# X shape: (100, 2)
# Unique classes: [0 1]

Author: Nirmal Parmar