Sweep: Refactor individual.php to improve quality, maintainability, and readability by following PSR standards (php-dna, closed)

curtisdelicata commented on September 26, 2024
Sweep: Refactor individual.php to improve its quality, maintainability, and readability by following PSR standards

from php-dna.

Comments (1)

sweep-ai commented on September 26, 2024

🚀 Here's the PR! #157

💎 Sweep Pro: I'm using GPT-4. You have unlimited GPT-4 tickets. (tracking ID: 6bf084ed9e)


Step 1: 🔎 Searching

I found the following snippets in your repository. I will now analyze these snippets and come up with a plan.

Some code snippets I think are relevant, in decreasing order of relevance. If a file is missing from here, you can mention its path in the ticket description.

<?php
require_once 'snps.php';
require_once 'snps/utils.php';
class Individual extends SNPs
{
/**
* Object used to represent and interact with an individual.
*
* The ``Individual`` object maintains information about an individual. The object provides
* methods for loading an individual's genetic data (SNPs) and normalizing it for use with the
* `lineage` framework.
*
* ``Individual`` inherits from ``snps.SNPs``.
*/
private string $_name;
public function __construct(string $name, mixed $raw_data = [], array $kwargs = [])
{
/**
* Initialize an ``Individual`` object.
*
* Parameters
* ----------
* name : str
* name of the individual
* raw_data : str, bytes, ``SNPs`` (or list or tuple thereof)
* path(s) to file(s), bytes, or ``SNPs`` object(s) with raw genotype data
* kwargs : array
* parameters to ``snps.SNPs`` and/or ``snps.SNPs.merge``
*/
$this->_name = $name;
$init_args = $this->_get_defined_kwargs(new ReflectionMethod(SNPs::class, '__construct'), $kwargs);
$merge_args = $this->_get_defined_kwargs(new ReflectionMethod(SNPs::class, ''), $kwargs);
parent::__construct(...array_values($init_args));
if (!is_array($raw_data)) {
$raw_data = [$raw_data];
}
foreach ($raw_data as $file) {
$s = $file instanceof SNPs ? $file : new SNPs($file, ...array_values($init_args));
$this->merge([$s], ...array_values($merge_args));
}
}
private function _get_defined_kwargs(ReflectionMethod $callable, array $kwargs): array
{
$parameters = $callable->getParameters();
$defined_kwargs = [];
foreach ($parameters as $parameter) {
$name = $parameter->getName();
if (array_key_exists($name, $kwargs)) {
$defined_kwargs[$name] = $kwargs[$name];
}
}
return $defined_kwargs;
}
public function __toString(): string
{
return sprintf("Individual('%s')", $this->_name);
}
public function getName(): string
{
/**
* Get this ``Individual``'s name.
*
* Returns
* -------
* str
*/
return $this->_name;
}
public function getVarName(): string
{
return clean_str($this->_name);

<?php
/**
* php-dna.
*
* Utility functions.
*
* @author Devmanateam <[email protected]>
* @copyright Copyright (c) 2020-2023, Devmanateam
* @license MIT
*
* @link http://github.com/familytree365/php-dna
*/
namespace Dna\Snps;
use Exception;
use ZipArchive;
use Dna\Snps\Utils;
/**
* The Singleton class defines the `GetInstance` method that serves as an
* alternative to constructor and lets clients access the same instance of this
* class over and over.
*/
// from multiprocessing import Pool; // You can use parallel or pthreads for multi-processing in PHP
// import os; // PHP has built-in OS functions
// import re; // PHP has built-in RegExp functions
// import shutil; // PHP has built-in filesystem functions
// import tempfile; // PHP has built-in temporary file functions
// import zipfile; // PHP has built-in ZipArchive class available
// from atomicwrites import atomic_write; // You can use a library or implement atomic writes in PHP
// import pandas as pd; // There is no direct PHP alternative to pandas; consider using array functions or a data manipulation library
class Parallelizer
{
private bool $_parallelize;
private ?int $_processes;
public function __construct(bool $parallelize = false, ?int $processes = null): void
{
$this->_parallelize = $parallelize;
/**
* Utils class provides utility functions for file manipulation, parallel processing,
* and other common tasks. It includes methods for gzipping files, creating directories,
* fetching current UTC time, saving data as CSV, cleaning strings, and zipping files.
*/
$this->_processes = $processes ?? os_cpu_count();
}
public function __invoke(callable $f, array $tasks): array
{
if ($this->_parallelize) {
// PHP does not have built-in support for parallel processing similar to Python's multiprocessing.
// Consider using alternative approaches or libraries for parallel processing in PHP.
// This example code is commented out as it requires the "parallel" PECL extension.
// $runtime = new \parallel\Runtime();
// $futures = [];
// foreach ($tasks as $task) {
// $futures[] = $runtime->run($f, [$task]);
// }
// return array_map(fn($future) => $future->value, $futures);
return array_map($f, $tasks); // Fallback to sequential processing
} else {
return array_map($f, $tasks);
}
}
function os_cpu_count(): int
{
// Use this function if you need to get the number of CPU cores in PHP
// You might need to adjust this code based on your environment
if (substr(php_uname('s'), 0, 7) == 'Windows') {
return (int) shell_exec('echo %NUMBER_OF_PROCESSORS%');
} else {
return (int) shell_exec('nproc');
}
}
}
class Utils
{
public static function gzip_file(string $src, string $dest): string
{
/**
* Gzip a file.
*
* @param string $src Path to file to gzip
* @param string $dest Path to output gzip file
*
* @return string Path to gzipped file
*/
$bufferSize = 4096;
$srcFile = fopen($src, "rb");
if ($srcFile === false) {
throw new Exception("Cannot open source file");
}
try {
$destFile = fopen($dest, "wb");
if ($destFile === false) {
throw new Exception("Cannot create destination file");
}
try {
$gzFile = gzopen($dest, "wb");
if ($gzFile === false) {
throw new Exception("Cannot create gzipped file");
}
try {
while (!feof($srcFile)) {
$buffer = fread($srcFile, $bufferSize);
gzwrite($gzFile, $buffer);
}
} finally {
gzclose($gzFile);
}
} finally {
fclose($destFile);
}
} finally {
fclose($srcFile);
}
return $dest;
}
}
/**
* Creates a directory if it doesn't exist.
*
* @param string $path Path to the directory to create.
* @return void
*/
public static function create_dir(string $path): void
{
if (!file_exists($path)) {
mkdir($path, 0777, true);
}
}
/**
* Gets the current UTC time.
*
* @return string Current UTC time in 'Y-m-d H:i:s' format.
*/
public static function get_utc_now(): string
{
return gmdate('Y-m-d H:i:s');
}
/**
* Saves data as a CSV file.
*
* @param array $data Data to save.
* @param string $filename Path to the CSV file.
* @return void
*/
public static function save_df_as_csv(array $data, string $filename): void
{
$fp = fopen($filename, 'w');
foreach ($data as $row) {
fputcsv($fp, $row);
}
fclose($fp);
}
/**
* Cleans a string to be used as a variable name.
*
* @param string $str String to clean.
* @return string Cleaned string.
*/
public static function clean_str(string $str): string
{
return preg_replace('/[^A-Za-z0-9_]/', '', $str);
}
/**
* Zips a file.
*
* @param string $src Path to the file to zip.
* @param string $dest Path to the output zip file.
* @return void
*/
public static function zip_file(string $src, string $dest): void
{
$zip = new ZipArchive();
if ($zip->open($dest, ZipArchive::CREATE) === TRUE) {
$zip->addFile($src, basename($src));
$zip->close();
}

<?php
declare(strict_types=1);
namespace DnaTest\Snps;
use Dna\Snps\SNPs;
class SnpsTest extends BaseSNPsTestCase
{
// private $table;
protected function setUp(): void
{
parent::setUp();
}
public static function empty_snps()
{
return [new SNPs(), new SNPs(""), new SNPs("tests/input/empty.txt")];
}
public function test_len()
{
$s = new SNPs("tests/input/generic.csv");
$this->assertEquals(count($s), 8);
}
public function test_len_empty()
{
foreach (self::empty_snps() as $s) {
$this->assertEquals(count($s), 0);
}
}
public function test__toString()
{
$s = new SNPs("tests/input/GRCh37.csv");
$this->assertEquals("SNPs('GRCh37.csv')", $s->__toString());
}
public function test__toString_bytes()
{
$data = file_get_contents("tests/input/GRCh37.csv");
$s = new SNPs($data);
$this->assertEquals("SNPs(<bytes>)", $s->__toString());
}
public function testAssembly()
{
$s = new SNPs("tests/input/GRCh38.csv");
$this->assertEquals($s->getAssembly(), "GRCh38");
}
public function testAssemblyNoSnps()
{
$emptySnps = $this->empty_snps();
foreach ($emptySnps as $snps) {
$this->assertEmpty($snps->getAssembly());
}
}
public function testBuild()
{
$s = new SNPs("tests/input/NCBI36.csv");
$this->assertEquals($s->getBuild(), 36);
$this->assertEquals($s->getAssembly(), "NCBI36");
}
public function testBuildDetectedNoSnps()
{
$emptySnps = $this->empty_snps();
foreach ($emptySnps as $snps) {
$this->assertFalse($snps->isBuildDetected());
}
}
public function testBuildNoSnps()
{
$emptySnps = $this->empty_snps();
foreach ($emptySnps as $snps) {
$this->assertEmpty($snps->getBuild());
}
}
public function testBuildDetectedPARSnps()
{
$snps = $this->loadAssignPARSnps('tests/input/GRCh37_PAR.csv');
$this->assertEquals(37, $snps->getBuild());
$this->assertTrue($snps->isBuildDetected());
$expectedSnps = $this->snps_GRCh37_PAR();
$actualSnps = $snps->getSnps();
$this->assertEquals($expectedSnps, $actualSnps);
}
public function test_notnull()
{
$s = new SNPs("tests/input/generic.csv");
$snps = $this->generic_snps();
unset($snps["rs5"]);
$this->assertEquals($s->notnull(), $snps, "Frames are not equal!");
}
public function test_heterozygous()
{
$s = new SNPs("tests/input/generic.csv");
$expected = $this->create_snp_df(
rsid: ["rs6", "rs7", "rs8"],
chrom: ["1", "1", "1"],
pos: [106, 107, 108],
genotype: ["GC", "TC", "AT"]
);
$this->assertEquals($expected, $s->heterozygous(), "Frames are not equal!");
}
public function test_homozygous()
{
$s = new SNPs("tests/input/generic.csv");
$expected = $this->create_snp_df(
rsid: ["rs1", "rs2", "rs3", "rs4"],
chrom: ["1", "1", "1", "1"],
pos: [101, 102, 103, 104],
genotype: ["AA", "CC", "GG", "TT"],
);
$this->assertEquals($expected, $s->homozygous(), "Frames are not equal!");
}
public function test_homozygous_chrom()
{
$s = new SNPs("tests/input/generic.csv");
$expected = $this->create_snp_df(
rsid: ["rs1", "rs2", "rs3", "rs4"],
chrom: ["1", "1", "1", "1"],
pos: [101, 102, 103, 104],
genotype: ["AA", "CC", "GG", "TT"],
);
$this->assertEquals($expected, $s->homozygous("1"), "Frames are not equal!");
}
public function test_valid_False()
{
foreach ($this->empty_snps() as $snps) {
$this->assertFalse($snps->isValid());
}
}
public function test_valid_True()
{
$s = new SNPs("tests/input/generic.csv");
$this->assertTrue($s->isValid());
}
public function test_only_detect_source()
{
$s = new SNPs("tests/input/generic.csv", true);
$this->assertEquals($s->getSource(), "generic");
$this->assertEquals(count($s), 0);
}
public function test_empty_dataframe()
{
// for snps in self.empty_snps():
// self.assertListEqual(
// list(snps.snps.columns.values), ["chrom", "pos", "genotype"]
// )
// self.assertEqual(snps.snps.index.name, "rsid")
// foreach ($this->empty_snps() as $snps) {
// $this->assertEquals(
// $snps->getSnps()->columns->toArray(),
// ["chrom", "pos", "genotype"]
// );
// $this->assertEquals($snps->getSnps()->index->name, "rsid");
// }
}
public function test_assembly_None()
{
$snps = new SNPs();
$this->assertFalse($snps->getAssembly());
}
// def test_summary(self):
// s = SNPs("tests/input/GRCh38.csv")
// self.assertDictEqual(
// s.summary,
// {
// "source": "generic",
// "assembly": "GRCh38",
// "build": 38,
// "build_detected": True,
// "count": 4,
// "chromosomes": "1, 3",
// "sex": "",
// },
// )
public function test_summary()
{
$s = new SNPs("tests/input/GRCh38.csv");
$this->assertEquals(
$s->getSummary(),
[
"source" => "generic",
"assembly" => "GRCh38",
"build" => 38,
"build_detected" => true,
"count" => 4,
"chromosomes" => "1, 3",
"sex" => "",
]
);
}
public function test_summary_no_snps()
{
foreach ($this->empty_snps() as $snps) {
$this->assertEquals($snps->getSummary(), []);
}
}
public function test_chromosomes()
{
$s = new SNPs("tests/input/chromosomes.csv");
var_dump($s->getChromosomes());
$this->assertEquals(["1", "2", "3", "5", "PAR", "MT"], $s->getChromosomes());
}
public function test_chromosomes_no_snps()
{
foreach ($this->empty_snps() as $snps) {
$this->assertEmpty($snps->getChromosomes());
}
}
public function test_sex_Female_X_chrom()
{
$s = $this->simulate_snps(
chrom: "X",
pos_start: 1,
pos_max: 155270560,
pos_step: 10000,
genotype: "AC"
);
$this->assertEquals("Female", $s->getSex());
}
public function test_sex_Female_Y_chrom()
{
$s = $this->simulate_snps(
chrom: "Y",
pos_start: 1,
pos_max: 59373566,
pos_step: 10000,
null_snp_step: 1
);
$this->assertEquals("Female", $s->getSex());
}
// def test_sex_Male_X_chrom(self):
// s = self.simulate_snps(
// chrom="X", pos_start=1, pos_max=155270560, pos_step=10000, genotype="AA"
// )
// self.assertEqual(s.count, 15528)
// s._deduplicate_XY_chrom()
// self.assertEqual(s.count, 15528)
// self.assertEqual(len(s.discrepant_XY), 0)
// self.assertEqual(s.sex, "Male")
public function test_sex_Male_X_chrom()
{
$s = $this->simulate_snps(
chrom: "X",
pos_start: 1,
pos_max: 155270560,
pos_step: 10000,
genotype: "AA"
);
$this->assertEquals(15528, $s->count());
$s->deduplicate_XY_chrom();
$this->assertEquals(15528, $s->count());
$this->assertEquals(0, count($s->getDiscrepantXY()));
$this->assertEquals("Male", $s->getSex());
}
// def test_sex_Male_X_chrom_discrepant_XY(self):
// s = self.simulate_snps(
// chrom="X", pos_start=1, pos_max=155270560, pos_step=10000, genotype="AA"
// )
// self.assertEqual(s.count, 15528)
// s._snps.loc["rs8001", "genotype"] = "AC"
// s._deduplicate_XY_chrom()
// self.assertEqual(s.count, 15527)
// result = self.create_snp_df(
// rsid=["rs8001"], chrom=["X"], pos=[80000001], genotype=["AC"]
// )
// pd.testing.assert_frame_equal(s.discrepant_XY, result, check_exact=True)
// self.assertEqual(s.sex, "Male")
public function test_sex_Male_X_chrom_discrepant_XY()
{
$s = $this->simulate_snps(
chrom: "X",
pos_start: 1,
pos_max: 155270560,
pos_step: 10000,
genotype: "AA"
);
$this->assertEquals(15528, $s->count());
// $s->getSnps()->loc["rs8001", "genotype"] = "AC";
$s->setValue("rs8001", "genotype", "AC");
$s->deduplicate_XY_chrom();
$this->assertEquals(15527, $s->count());
$result = $this->create_snp_df(
rsid: ["rs8001"],
chrom: ["X"],
pos: [80000001],
genotype: ["AC"]
);
$this->assertEquals($result, $s->getDiscrepantXY());
$this->assertEquals("Male", $s->getSex());
}
// def test_sex_Male_Y_chrom(self):
// s = self.simulate_snps(chrom="Y", pos_start=1, pos_max=59373566, pos_step=10000)
// self.assertEqual(s.sex, "Male")
public function test_sex_male_Y_chrom()
{
$s = $this->simulate_snps(
chrom: "Y",
pos_start: 1,
pos_max: 59373566,
pos_step: 10000
);
$this->assertEquals("Male", $s->getSex());
}
// def test_sex_not_determined(self):
// s = self.simulate_snps(
// chrom="1", pos_start=1, pos_max=249250621, pos_step=10000
// )
// self.assertEqual(s.sex, "")
public function test_sex_not_determined()
{
$s = $this->simulate_snps(
chrom: "1",
pos_start: 1,
pos_max: 249250621,
pos_step: 10000
);
$this->assertEquals("", $s->getSex());
}
// def test_sex_no_snps(self):
// for snps in self.empty_snps():
// self.assertFalse(snps.sex)
public function test_sex_no_snps()
{
foreach ($this->empty_snps() as $snps) {
$this->assertEmpty($snps->getSex());
}
}
public function test_source()
{
$s = new SNPs("tests/input/generic.csv");
$this->assertEquals("generic", $s->getSource());
$this->assertEquals(["generic"], $s->getAllSources());
}
public function test_source_no_snps()
{
foreach ($this->empty_snps() as $snps) {
$this->assertEmpty($snps->getSource());
}
}
public function test_count()
{
$s = new SNPs("tests/input/NCBI36.csv");
$this->assertEquals(4, $s->count());
}
public function test_count_no_snps()
{
foreach ($this->empty_snps() as $snps) {
$this->assertEquals(0, $snps->count());
$this->assertEmpty($snps->getSnps());
}
}
public function testDeduplicateFalse()
{
$snps = new SNPs("tests/input/duplicate_rsids.csv", deduplicate: false);
$result = $this->create_snp_df(["rs1", "rs1", "rs1"], ["1", "1", "1"], [101, 102, 103], ["AA", "CC", "GG"]);
$this->assertEquals($result, $snps->snps);
}
public function testDeduplicateMTChrom()
{
$snps = new SNPs("tests/input/ancestry_mt.txt");
$result = $this->create_snp_df(["rs1", "rs2"], ["MT", "MT"], [101, 102], ["A", null]);
$this->assertEquals($result, $snps->snps);
$heterozygousMTSnps = $this->create_snp_df(["rs3"], ["MT"], [103], ["GC"]);
$this->assertEquals($heterozygousMTSnps, $snps->heterozygous_MT);
}
public function testDeduplicateMTChromFalse()
{
$snps = new SNPs("tests/input/ancestry_mt.txt", deduplicate: false);
$result = $this->create_snp_df(["rs1", "rs2", "rs3"], ["MT", "MT", "MT"], [101, 102, 103], ["AA", null, "GC"]);
$this->assertEquals($result, $snps->snps);
}
public function testDuplicateRsids()
{
$snps = new SNPs("tests/input/duplicate_rsids.csv");
$result = $this->create_snp_df(["rs1"], ["1"], [101], ["AA"]);
$duplicate = $this->create_snp_df(["rs1", "rs1"], ["1", "1"], [102, 103], ["CC", "GG"]);
$this->assertEquals($result, $snps->getSnps());
$this->assertEquals($duplicate, $snps->duplicate);
}
public function _run_remap_test($f, $mappings)
{
if ($this->downloads_enabled) {
$f();
} else {
$mock = $this->createMock(Resources::class);
$mock->method('get_assembly_mapping_data')->willReturn($mappings);
$this->getMockBuilder(Resources::class)
->setMethods(['get_assembly_mapping_data'])
->getMock();
$f();
}
}
public function test_remap_36_to_37()
{
$this->_run_remap_test(function () {
$s = new SNPs("tests/input/NCBI36.csv");
list($chromosomes_remapped, $chromosomes_not_remapped) = $s->remap(37);
$this->assertEquals(37, $s->build);
$this->assertEquals("GRCh37", $s->assembly);
$this->assertCount(2, $chromosomes_remapped);
$this->assertCount(0, $chromosomes_not_remapped);
$this->assertEquals($this->snps_GRCh37(), $s->getSnps());
}, $this->NCBI36_GRCh37());
}
public function test_remap_36_to_37_multiprocessing()
{
$this->_run_remap_test(function () {
$s = new SNPs("tests/input/NCBI36.csv", true);
[$chromosomes_remapped, $chromosomes_not_remapped] = $s->remap(37);
$this->assertEquals(37, $s->build);
$this->assertEquals("GRCh37", $s->assembly);
$this->assertCount(2, $chromosomes_remapped);
$this->assertCount(0, $chromosomes_not_remapped);
$this->assertSnpsArrayEquals($s->snps, $this->snps_GRCh37(), true);
}, $this->NCBI36_GRCh37());
}
public function test_remap_37_to_36()
{
$this->_run_remap_test(function () {
$s = new SNPs("tests/input/GRCh37.csv");
[$chromosomes_remapped, $chromosomes_not_remapped] = $s->remap(36);
$this->assertEquals(36, $s->build);
$this->assertEquals("NCBI36", $s->assembly);
$this->assertCount(2, $chromosomes_remapped);
$this->assertCount(0, $chromosomes_not_remapped);
$this->assertSnpsArrayEquals($s->snps, $this->snps_NCBI36(), true);
}, $this->GRCh37_NCBI36());
}
public function test_remap_37_to_38()
{
$this->_run_remap_test(function () {
$s = new SNPs("tests/input/GRCh37.csv");
[$chromosomes_remapped, $chromosomes_not_remapped] = $s->remap(38);
$this->assertEquals(38, $s->build);
$this->assertEquals("GRCh38", $s->assembly);
$this->assertCount(2, $chromosomes_remapped);
$this->assertCount(0, $chromosomes_not_remapped);
$this->assertSnpsArrayEquals($s->snps, $this->snps_GRCh38(), true);
}, $this->GRCh37_GRCh38());
}
public function test_remap_37_to_38_with_PAR_SNP()
{
$this->_run_remap_test(function () {
$s = $this->loadAssignPARSNPs("tests/input/GRCh37_PAR.csv");
$this->assertEquals(4, $s->count);
[$chromosomes_remapped, $chromosomes_not_remapped] = $s->remap(38);
$this->assertEquals(38, $s->build);
$this->assertEquals("GRCh38", $s->assembly);
$this->assertCount(2, $chromosomes_remapped);
$this->assertCount(1, $chromosomes_not_remapped);
$this->assertEquals(3, $s->count);
$this->assertSnpsArrayEquals($s->snps, $this->snps_GRCh38_PAR(), true);
}, $this->GRCh37_GRCh38_PAR());
}
public function test_remap_37_to_37()
{
$s = new SNPs("tests/input/GRCh37.csv");
[$chromosomes_remapped, $chromosomes_not_remapped] = $s->remap(37);
$this->assertEquals(37, $s->build);
$this->assertEquals("GRCh37", $s->assembly);
$this->assertCount(0, $chromosomes_remapped);
$this->assertCount(2, $chromosomes_not_remapped);
$this->assertSnpsArrayEquals($s->snps, $this->snps_GRCh37(), true);
}
public function test_remap_invalid_assembly()
{
$s = new SNPs("tests/input/GRCh37.csv");
[$chromosomes_remapped, $chromosomes_not_remapped] = $s->remap(-1);
$this->assertEquals(37, $s->build);
$this->assertEquals("GRCh37", $s->assembly);
$this->assertCount(0, $chromosomes_remapped);
$this->assertCount(2, $chromosomes_not_remapped);
}
public function test_remap_no_snps()
{
$s = new SNPs();
[$chromosomes_remapped, $chromosomes_not_remapped] = $s->remap(38);
$this->assertFalse($s->build);
$this->assertCount(0, $chromosomes_remapped);
$this->assertCount(0, $chromosomes_not_remapped);
}
public function testSaveBufferBinary()
{
$s = new SNPs("tests/input/generic.csv");
$out = fopen('php://memory', 'wb');
$s->toTsv($out);
rewind($out);
$this->assertTrue(strpos(stream_get_contents($out), "# Generated by snps") === 0);
}
public function testSaveNoSNPs()
{
$s = new SNPs();
$this->assertFalse($s->toTsv());
}
public function testSaveNoSNPsVCF()
{
$s = new SNPs();
$this->assertFalse($s->toVcf());
}
public function testSaveSource()
{
$tmpdir = sys_get_temp_dir();
$s = new SNPs("tests/input/GRCh38.csv", output_dir: $tmpdir);
$dest = $tmpdir . DIRECTORY_SEPARATOR . "generic_GRCh38.txt";
$this->assertEquals($s->toTsv(), $dest);
$snps = new SNPs($dest);
$this->assertEquals($snps->build, 38);
$this->assertTrue($snps->buildDetected);
$this->assertEquals($snps->source, "generic");
$this->assertEquals($snps->_source, ["generic"]);
$this->assertEquals($this->snps_GRCh38(), $snps->getSnps());
}
private function makeAncestryAssertions($d)
{
$this->assertEquals($d["population_code"], "ITU");
$this->assertEquals($d["population_description"], "Indian Telugu in the UK");
$this->assertIsFloat($d["population_percent"]);
$this->assertGreaterThanOrEqual(0.2992757864426246 - 0.00001, $d["population_percent"]);
$this->assertLessThanOrEqual(0.2992757864426246 + 0.00001, $d["population_percent"]);
$this->assertEquals($d["superpopulation_code"], "SAS");
$this->assertEquals($d["superpopulation_description"], "South Asian Ancestry");
$this->assertIsFloat($d["superpopulation_percent"]);
$this->assertGreaterThanOrEqual(0.827977563875996 - 0.00001, $d["superpopulation_percent"]);
$this->assertLessThanOrEqual(0.827977563875996 + 0.00001, $d["superpopulation_percent"]);
$this->assertArrayHasKey("predicted_population_population", $d["ezancestry_df"]);
$this->assertArrayHasKey("predicted_population_superpopulation", $d["ezancestry_df"]);
}
// public function testAncestry()
// {
// $ezancestryMods = ["ezancestry", "ezancestry.commands"];
// $poppedMods = $this->popModules($ezancestryMods);
// if (extension_loaded("ezancestry")) {
// // Test with ezancestry if installed
// $s = new SNPs("tests/input/generic.csv");
// $this->makeAncestryAssertions($s->predictAncestry());
// }
// // Mock ezancestry modules
// foreach ($ezancestryMods as $mod) {
// $this->setMockedModule($mod);
// }
// // Mock the predict function
// $mockedData = [
// "predicted_population_population" => ["ITU"],
// "population_description" => ["Indian Telugu in the UK"],
// "ITU" => [0.2992757864426246],
// "predicted_population_superpopulation" => ["SAS"],
// "superpopulation_name" => ["South Asian Ancestry"],
// "SAS" => [0.827977563875996],
// ];
// $this->setMockedFunction("ezancestry.commands", "predict", $mockedData);
// // Test with mocked ezancestry
// $s = new SNPs("tests/input/generic.csv");
// $this->makeAncestryAssertions($s->predictAncestry());
// // Unload mocked ezancestry modules
// $this->popModules($ezancestryMods);
// // Restore ezancestry modules if ezancestry is installed
// $this->restoreModules($poppedMods);
// }
public function testAncestryModuleNotFoundError()
{
if (!extension_loaded("ezancestry")) {
// Test when ezancestry is not installed
$s = new SNPs("tests/input/generic.csv");
$this->expectException(ModuleNotFoundError::class);
$this->expectExceptionMessage("Ancestry prediction requires the ezancestry package; please install it using `composer require ezancestry/ezancestry`");
$s->predictAncestry();
}
}
private function getChipClusters($pos = [101, 102, 103, 104, 105, 106, 107, 108], $cluster = "c1", $length = 8)
{
$data = [];
for ($i = 0; $i < $length; $i++) {
$data[] = [
"chrom" => "1",
"pos" => $pos[$i],
"clusters" => $cluster
];
}
return collect($data);
}
public function runClusterTest($f, $chipClusters)
{
$mock = $this->getMockBuilder(Resources::class)
->setMethods(['getChipClusters'])
->getMock();
$mock->method('getChipClusters')
->willReturn($chipClusters);
$this->assertInstanceOf(Resources::class, $mock);
$f($mock);
}
public function testCluster()
{
$this->runClusterTest(function ($mock) {
$s = new SNPs("tests/input/23andme.txt", $mock);
$this->assertEquals($s->getCluster(), "c1");
}, $this->getChipClusters());
}
public function testChip()
{
$this->runClusterTest(function ($mock) {
$s = new SNPs("tests/input/23andme.txt", $mock);
$this->assertEquals($s->getChip(), "HTS iSelect HD");
}, $this->_getChipClusters());
}
public function testChipVersion()
{
$this->runClusterTest(function ($mock) {
$s = new SNPs("tests/input/23andme.txt", $mock);
$this->assertEquals($s->getChipVersion(), "v4");
}, $this->getChipClusters());
}
public function testChipVersionNA()
{
$this->runClusterTest(function ($mock) {
$s = new SNPs("tests/input/myheritage.csv", $mock);
$this->assertEquals($s->getCluster(), "c3");
$this->assertEquals($s->getChipVersion(), "");
}, $this->getChipClusters("c3"));
}
public function testComputeClusterOverlapSetPropertyValues()
{
$this->runClusterTest(function ($mock) {
$s = new SNPs("tests/input/23andme.txt", $mock);
$s->computeClusterOverlap();
$this->assertEquals($s->getCluster(), "c1");
$this->assertEquals($s->getChip(), "HTS iSelect HD");
$this->assertEquals($s->getChipVersion(), "v4");
}, $this->_getChipClusters());
}
public function testComputeClusterOverlapThresholdNotMet()
{
$this->runClusterTest(function ($mock) {
$s = new SNPs("tests/input/23andme.txt", $mock);
$this->assertEquals($s->getCluster(), "");
}, $this->_getChipClusters(range(104, 112)));
}
public function testComputeClusterOverlapSourceWarning()
{
$this->runClusterTest(function ($mock) {
$s = new SNPs("tests/input/generic.csv", $mock);
$this->assertEquals($s->getCluster(), "c1");
}, $this->_getChipClusters());
$logs = $this->getActualOutput();
$this->assertStringContainsString(
"Detected SNPs data source not found in cluster's company composition",
$logs
);
}
public function testComputeClusterOverlapRemap()
{
$this->runClusterTest(function ($mock) {
$s = new SNPs("tests/input/23andme.txt", $mock);
// drop SNPs not currently remapped by test mapping data
$snps = $s->getSnps();
unset($snps["rs4"]);
unset($snps["rs5"]);
unset($snps["rs6"]);
unset($snps["rs7"]);
unset($snps["rs8"]);
$s->setBuild(36); // manually set build 36
$this->assertEquals($s->getCluster(), "c1");
$this->assertEquals($s->getBuild(), 36); // ensure copy gets remapped
}, $this->_getChipClusters(["pos" => range(101, 104)], 3));
}
public function testSnpsQc()
{
// Simulate the creation of your SNP object with the provided CSV data
$s = new SNPs("tests/input/generic.csv");
// Identify quality controlled SNPs and get them as an array or other data structure
$snpsQc = $s->getSnpsQc();
// Create an array that represents your expected QC SNPs, excluding rs4 and rs6
$expectedQcSnps = $this->genericSnps();
unset($expectedQcSnps['rs4']);
unset($expectedQcSnps['rs6']);
// Assert that the computed QC SNPs match the expected QC SNPs
$this->assertEquals($expectedQcSnps, $snpsQc);
}
public function testLowQuality()
{
// Simulate the creation of your SNP object with the provided CSV data
$s = new SNPs("tests/input/generic.csv");
// Identify low-quality SNPs and get them as an array or other data structure
$lowQualitySnps = $s->getLowQualitySnps();
// Create an array that represents your expected low-quality SNPs, including rs4 and rs6
$expectedLowQualitySnps = $this->genericSnps();
// Assert that the computed low-quality SNPs match the expected low-quality SNPs
$this->assertEquals($expectedLowQualitySnps, $lowQualitySnps);
}
// public function testSnpsQcLowQualityNoCluster() {
// function f() {
// $s = new SNPs("tests/input/generic.csv");
// // Identify low-quality SNPs
// $this->assertEquals(
// $s->low_quality,
// $this->getLowQualitySnps(['rs4', 'rs6'])
// );
// // Return already identified low-quality SNPs (test branch)
// $this->assertEquals(
// $s->low_quality,
// $this->getLowQualitySnps(['rs4', 'rs6'])
// );
// }
// $this->runLowQualitySnpsTest('f', $this->getLowQualitySnps(), ['cluster' => '']);
// }
// private function testIdentifyLowQualitySnpsRemap() {
// $f = function() {
// $s = new SNPs("tests/input/generic.csv");
// // Drop SNPs not currently remapped by test mapping data
// $s->_snps->drop(["rs4", "rs5", "rs6", "rs7", "rs8"], 1);
// $s->_build = 36; // Manually set build 36
// $s->identifyLowQualitySnps();
// $this->assertEquals($s->snpsQc, $this->getLowQualitySnps(['rs1', 'rs3']));
// $this->assertEquals($s->lowQuality, $this->getLowQualitySnps()['rs2']);
// $this->assertEquals($s->build, 36); // Ensure copy gets remapped
// }
// $mock = $this->getMockBuilder('Resources')
// ->setMethods(['getAssemblyMappingData'])
// ->getMock();
// $mock->expects($this->any())
// ->method('getAssemblyMappingData')
// ->willReturn($this->getTestAssemblyMappingData(
// "NCBI36",
// "GRCh37",
// array_fill(0, 8, 1),
// array(101, 101, 102, 102, 103, 103, 0, 0)
// ));
// $this->runLowQualitySnpsTest('f', $this->getLowQualitySnps(array(102, 1001)));
// }


Step 2: ⌨️ Coding

Modify src/Individual.php with contents: Rename private properties and methods to follow PSR naming conventions.

<original_code>
private string $_name;

public function __construct(string $name, mixed $raw_data = [], array $kwargs = [])
{
    /**
     * Initialize an ``Individual`` object.
     *
     * Parameters
     * ----------
     * name : str
     *     name of the individual
     * raw_data : str, bytes, ``SNPs`` (or list or tuple thereof)
     *     path(s) to file(s), bytes, or ``SNPs`` object(s) with raw genotype data
     * kwargs : array
     *     parameters to ``snps.SNPs`` and/or ``snps.SNPs.merge``
     */
    $this->_name = $name;

    $init_args = $this->_get_defined_kwargs(new ReflectionMethod(SNPs::class, '__construct'), $kwargs);
    $merge_args = $this->_get_defined_kwargs(new ReflectionMethod(SNPs::class, ''), $kwargs);

</original_code>

<new_code>
private string $name;

/**
 * Initialize an Individual object.
 *
 * @param string $name Name of the individual
 * @param mixed $rawData Path(s) to file(s), bytes, or SNPs object(s) with raw genotype data
 * @param array $kwargs Parameters to the SNPs constructor and/or SNPs::merge()
 */
public function __construct(string $name, mixed $rawData = [], array $kwargs = [])
{
    $this->name = $name;

    $initArgs = $this->getDefinedKwargs(new ReflectionMethod(SNPs::class, '__construct'), $kwargs);
    $mergeArgs = $this->getDefinedKwargs(new ReflectionMethod(SNPs::class, 'merge'), $kwargs);

</new_code>
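
Besides the renames, the new code above replaces the empty method name `''` with `'merge'` in the second `ReflectionMethod` call. A minimal sketch of why that matters: `ReflectionMethod` throws a `ReflectionException` when the class has no method of the given name. `ReflectedSnps`, its parameter names, and `reflectMethodOrNull()` are hypothetical stand-ins for illustration, not the real php-dna API.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the real SNPs class; only the method names matter here.
class ReflectedSnps
{
    public function __construct(string $file = '')
    {
    }

    // Parameter names are illustrative assumptions, not the real merge() signature.
    public function merge(array $snpsObjects = [], bool $removeDuplicates = true): array
    {
        return [];
    }
}

/**
 * Return a ReflectionMethod, or null when the class has no method of that name.
 */
function reflectMethodOrNull(string $class, string $method): ?ReflectionMethod
{
    try {
        return new ReflectionMethod($class, $method);
    } catch (ReflectionException $e) {
        return null;
    }
}

$empty = reflectMethodOrNull(ReflectedSnps::class, '');      // no method has an empty name
$merge = reflectMethodOrNull(ReflectedSnps::class, 'merge'); // resolves to merge()

// Collect the parameter names that a kwargs filter would be allowed to keep.
$mergeParameterNames = array_map(
    static fn (ReflectionParameter $p): string => $p->getName(),
    $merge?->getParameters() ?? []
);
```

With the original empty name, the constructor would likely fail before any data is loaded, so the `'merge'` fix is a behavior change and not only a style change.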

Modify src/Individual.php with contents: Add type hints, return type declarations, and docblock comments to all methods.

<original_code>
private function _get_defined_kwargs(ReflectionMethod $callable, array $kwargs): array
{
    $parameters = $callable->getParameters();
    $defined_kwargs = [];
    foreach ($parameters as $parameter) {
        $name = $parameter->getName();
        if (array_key_exists($name, $kwargs)) {
            $defined_kwargs[$name] = $kwargs[$name];
        }
    }

    return $defined_kwargs;
}

public function __toString(): string
{
    return sprintf("Individual('%s')", $this->_name);
}

public function getName(): string
{
    /**
     * Get this ``Individual``'s name.
     *
     * Returns
     * -------
     * str
     */
    return $this->_name;
}

public function getVarName(): string
{
    return clean_str($this->_name);
}

</original_code>

<new_code>
/**
 * Get defined keyword arguments for a method
 *
 * @param ReflectionMethod $method The method to get arguments for
 * @param array $kwargs The keyword arguments to filter
 * @return array The defined keyword arguments
 */
private function getDefinedKwargs(ReflectionMethod $method, array $kwargs): array
{
    $parameters = $method->getParameters();
    $definedKwargs = [];
    foreach ($parameters as $parameter) {
        $name = $parameter->getName();
        if (array_key_exists($name, $kwargs)) {
            $definedKwargs[$name] = $kwargs[$name];
        }
    }

    return $definedKwargs;
}

/**
 * Get the string representation of the Individual
 *
 * @return string The string representation
 */
public function __toString(): string
{
    return sprintf("Individual('%s')", $this->name);
}

/**
 * Get the Individual's name
 *
 * @return string The name
 */
public function getName(): string
{
    return $this->name;
}

/**
 * Get a variable-safe version of the Individual's name
 *
 * @return string The variable-safe name
 */
public function getVarName(): string
{
    return clean_str($this->name);
}

</new_code>
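
To make the filtering step in `getDefinedKwargs` concrete, here is a minimal standalone sketch. The `Loader` class, its parameters, and the `filterKwargs` function are hypothetical stand-ins invented for this illustration and are not part of php-dna; only `ReflectionMethod::getParameters()` and `ReflectionParameter::getName()` come from PHP itself.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for SNPs; its parameters are invented for illustration.
final class Loader
{
    public function __construct(string $file = '', bool $onlyDetectSource = false, int $processes = 1)
    {
    }
}

/**
 * Keep only the entries of $kwargs whose keys match a parameter name of $method.
 * The result follows the method's declaration order, not the order of $kwargs.
 */
function filterKwargs(ReflectionMethod $method, array $kwargs): array
{
    $defined = [];
    foreach ($method->getParameters() as $parameter) {
        $name = $parameter->getName();
        if (array_key_exists($name, $kwargs)) {
            $defined[$name] = $kwargs[$name];
        }
    }

    return $defined;
}

$kwargs = ['processes' => 4, 'unknownOption' => true, 'file' => 'genome.csv'];
$filtered = filterKwargs(new ReflectionMethod(Loader::class, '__construct'), $kwargs);

var_export($filtered);
```

Keys that do not match a parameter name (here `unknownOption`) are silently dropped, which is presumably why the same `$kwargs` array can be filtered once for the constructor and once for `merge`.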

Modify src/Individual.php with contents: Simplify constructor logic and use more descriptive variable names.

<original_code>
public function __construct(string $name, mixed $raw_data = [], array $kwargs = [])
{
    /**
     * Initialize an Individual object.
     *
     * Parameters
     * ----------
     * name : str
     *     name of the individual
     * raw_data : str, bytes, SNPs (or list or tuple thereof)
     *     path(s) to file(s), bytes, or SNPs object(s) with raw genotype data
     * kwargs : array
     *     parameters to snps.SNPs and/or snps.SNPs.merge
     */
    $this->_name = $name;

    $init_args = $this->_get_defined_kwargs(new ReflectionMethod(SNPs::class, '__construct'), $kwargs);
    $merge_args = $this->_get_defined_kwargs(new ReflectionMethod(SNPs::class, ''), $kwargs);

    parent::__construct(...array_values($init_args));

    if (!is_array($raw_data)) {
        $raw_data = [$raw_data];
    }

    foreach ($raw_data as $file) {
        $s = $file instanceof SNPs ? $file : new SNPs($file, ...array_values($init_args));
        $this->merge([$s], ...array_values($merge_args));
    }
}

</original_code>

<new_code>
/**
 * Initialize an Individual object
 *
 * @param string $name Name of the individual
 * @param mixed $rawData Path(s) to file(s), bytes, or SNPs object(s) with raw genotype data
 * @param array $kwargs Parameters to snps.SNPs and/or snps.SNPs.merge
 */
public function __construct(string $name, mixed $rawData = [], array $kwargs = [])
{
    $this->name = $name;

    $snpsConstructorArgs = $this->getDefinedKwargs(new ReflectionMethod(SNPs::class, '__construct'), $kwargs);
    $snpsMergeArgs = $this->getDefinedKwargs(new ReflectionMethod(SNPs::class, 'merge'), $kwargs);

    parent::__construct(...array_values($snpsConstructorArgs));

    $rawDataArray = is_array($rawData) ? $rawData : [$rawData];

    foreach ($rawDataArray as $data) {
        $snps = $data instanceof SNPs ? $data : new SNPs($data, ...array_values($snpsConstructorArgs));
        $this->merge([$snps], ...array_values($snpsMergeArgs));
    }
}

</new_code>
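
The simplified constructor replaces the `if (!is_array(...))` block with a ternary that wraps a single value in a list. A small sketch of that normalization follows; `normalizeToList` is a hypothetical helper name used only here, and the file names are made up.

```php
<?php

declare(strict_types=1);

/**
 * Wrap a single value in a list; leave arrays untouched.
 * normalizeToList is a hypothetical helper name used only in this sketch.
 */
function normalizeToList(mixed $value): array
{
    return is_array($value) ? $value : [$value];
}

$single = normalizeToList('genome.csv');
$many = normalizeToList(['a.csv', 'b.csv']);

// An (array) cast would turn an object into an array of its properties,
// which is why the ternary is the safer choice when objects may be passed.
$object = new stdClass();
$object->path = 'c.csv';
$wrapped = normalizeToList($object);
$cast = (array) $object;
```

The difference matters for this constructor because `$rawData` may be an `SNPs` object: the ternary keeps the object intact as the only list element, whereas a cast would not.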

Modify src/Individual.php with contents: Extract logic for creating SNPs objects into a separate private method.

<original_code>
public function __construct(string $name, mixed $raw_data = [], array $kwargs = [])
{
    /**
     * Initialize an Individual object.
     *
     * Parameters
     * ----------
     * name : str
     *     name of the individual
     * raw_data : str, bytes, SNPs (or list or tuple thereof)
     *     path(s) to file(s), bytes, or SNPs object(s) with raw genotype data
     * kwargs : array
     *     parameters to snps.SNPs and/or snps.SNPs.merge
     */
    $this->_name = $name;

    $init_args = $this->_get_defined_kwargs(new ReflectionMethod(SNPs::class, '__construct'), $kwargs);
    $merge_args = $this->_get_defined_kwargs(new ReflectionMethod(SNPs::class, ''), $kwargs);

    parent::__construct(...array_values($init_args));

    if (!is_array($raw_data)) {
        $raw_data = [$raw_data];
    }

    foreach ($raw_data as $file) {
        $s = $file instanceof SNPs ? $file : new SNPs($file, ...array_values($init_args));
        $this->merge([$s], ...array_values($merge_args));
    }
}

</original_code>

<new_code>
/**
 * Initialize an Individual object
 *
 * @param string $name Name of the individual
 * @param mixed $rawData Path(s) to file(s), bytes, or SNPs object(s) with raw genotype data
 * @param array $kwargs Parameters to snps.SNPs and/or snps.SNPs.merge
 */
public function __construct(string $name, mixed $rawData = [], array $kwargs = [])
{
    $this->name = $name;

    $snpsConstructorArgs = $this->getDefinedKwargs(new ReflectionMethod(SNPs::class, '__construct'), $kwargs);
    $snpsMergeArgs = $this->getDefinedKwargs(new ReflectionMethod(SNPs::class, 'merge'), $kwargs);

    parent::__construct(...array_values($snpsConstructorArgs));

    $rawDataArray = is_array($rawData) ? $rawData : [$rawData];

    foreach ($rawDataArray as $data) {
        $snps = $this->createSnpsObject($data, $snpsConstructorArgs);
        $this->merge([$snps], ...array_values($snpsMergeArgs));
    }
}

/**
 * Create an SNPs object from raw data
 *
 * @param mixed $data The raw data (file path, bytes, or SNPs object)
 * @param array $constructorArgs The constructor arguments for the SNPs object
 * @return SNPs The created SNPs object
 */
private function createSnpsObject(mixed $data, array $constructorArgs): SNPs
{
    return $data instanceof SNPs ? $data : new SNPs($data, ...array_values($constructorArgs));
}

</new_code>
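
One point worth checking in both the original and the refactored code is the `...array_values($args)` spread. The filtered array is keyed by parameter name, but `array_values()` discards those keys, so the values are bound positionally. If a caller supplies only some of the optional parameters, a value can land on the wrong parameter. The sketch below illustrates this with a hypothetical `describe` function; it is an observation about PHP semantics, not a claim about how the `SNPs` constructor is actually declared.

```php
<?php

declare(strict_types=1);

// Hypothetical function standing in for a constructor with optional parameters.
function describe(string $assembly = 'GRCh37', int $processes = 1): string
{
    return sprintf('%s/%d', $assembly, $processes);
}

// Only the second parameter was supplied by the caller.
$filtered = ['processes' => 4];

// String keys are passed as named arguments (PHP 8.0+), so 4 reaches $processes.
$named = describe(...$filtered);

// array_values() discards the keys, so the value would be bound to the first
// parameter instead. With strict types this raises a TypeError here.
try {
    $positional = describe(...array_values($filtered));
} catch (TypeError $error) {
    $positional = 'TypeError';
}
```

Since the code already uses `mixed`, which requires PHP 8.0, spreading the keyed array directly may be an option, assuming the keys in `$kwargs` match the real parameter names.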

Modify src/Individual.php with contents: Fix indentation, add blank lines, ensure opening braces are on same line, add spaces around operators.

<original_code>

        $this->_name = $name;

        $init_args = $this->_get_defined_kwargs(new ReflectionMethod(SNPs::class, '__construct'), $kwargs);
        $merge_args = $this->_get_defined_kwargs(new ReflectionMethod(SNPs::class, ''), $kwargs);

        parent::__construct(...array_values($init_args));

        if (!is_array($raw_data)) {
            $raw_data = [$raw_data];
        }

        foreach ($raw_data as $file) {
            $s = $file instanceof SNPs ? $file : new SNPs($file, ...array_values($init_args));
            $this->merge([$s], ...array_values($merge_args));
        }
    }

    private function _get_defined_kwargs(ReflectionMethod $callable, array $kwargs): array
    {
        $parameters = $callable->getParameters();
        $defined_kwargs = [];
        foreach ($parameters as $parameter) {
            $name = $parameter->getName();
            if (array_key_exists($name, $kwargs)) {
                $defined_kwargs[$name] = $kwargs[$name];
            }
        }

        return $defined_kwargs;
    }

    public function __toString(): string
    {
        return sprintf("Individual('%s')", $this->_name);
    }

    public function getName(): string
    {
        /**
         * Get this ``Individual``'s name.
         *
         * Returns
         * -------
         * str
         */
        return $this->_name;
    }

    public function getVarName(): string
    {
        return clean_str($this->_name);
    }
}

</original_code>

<new_code>
        $this->name = $name;

        $snpsConstructorArgs = $this->getDefinedKwargs(new ReflectionMethod(SNPs::class, '__construct'), $kwargs);
        $snpsMergeArgs = $this->getDefinedKwargs(new ReflectionMethod(SNPs::class, 'merge'), $kwargs);

        parent::__construct(...array_values($snpsConstructorArgs));

        $rawDataArray = is_array($rawData) ? $rawData : [$rawData];

        foreach ($rawDataArray as $data) {
            $snps = $this->createSnpsObject($data, $snpsConstructorArgs);
            $this->merge([$snps], ...array_values($snpsMergeArgs));
        }
    }

    /**
     * Get defined keyword arguments for a method
     *
     * @param ReflectionMethod $method The method to get arguments for
     * @param array $kwargs The keyword arguments to filter
     * @return array The defined keyword arguments
     */
    private function getDefinedKwargs(ReflectionMethod $method, array $kwargs): array
    {
        $parameters = $method->getParameters();
        $definedKwargs = [];
        foreach ($parameters as $parameter) {
            $name = $parameter->getName();
            if (array_key_exists($name, $kwargs)) {
                $definedKwargs[$name] = $kwargs[$name];
            }
        }

        return $definedKwargs;
    }

    /**
     * Create an SNPs object from raw data
     *
     * @param mixed $data The raw data (file path, bytes, or SNPs object)
     * @param array $constructorArgs The constructor arguments for the SNPs object
     * @return SNPs The created SNPs object
     */
    private function createSnpsObject(mixed $data, array $constructorArgs): SNPs
    {
        return $data instanceof SNPs ? $data : new SNPs($data, ...array_values($constructorArgs));
    }

    /**
     * Get the string representation of the Individual
     *
     * @return string The string representation
     */
    public function __toString(): string
    {
        return sprintf("Individual('%s')", $this->name);
    }

    /**
     * Get the Individual's name
     *
     * @return string The name
     */
    public function getName(): string
    {
        return $this->name;
    }

    /**
     * Get a variable-safe version of the Individual's name
     *
     * @return string The variable-safe name
     */
    public function getVarName(): string
    {
        return clean_str($this->name);
    }
}

</new_code>

Step 3: 🔁 Code Review

I have finished reviewing the code for completeness. I did not find errors for [`sweep/refactor_generally_to_improve_quality_th_42ca9`](https://github.com/liberu-genealogy/php-dna/commits/sweep/refactor_generally_to_improve_quality_th_42ca9).

💡 To recreate the pull request, edit the issue title or description. Something wrong? [Let us know](https://community.sweep.dev/). *This is an automated message generated by [Sweep AI](https://sweep.dev).*
